ABSTRACT

Research on instructional improvement interventions 
for college teachers is reviewed and implications for practice are 
briefly considered. Attention is directed to both the methodological 
soundness of the program evaluations and the findings of the studies. 
Interventions to assist faculty to change their teaching attitudes, 
roles or activities are considered. The following methods of 
assessing program impacts are identified: measures of the professors'
attitudes, observations of their classroom behavior, student ratings,
and measures of students' learning. Systematic research studies on 
the following types of interventions are analyzed in detail: grants 
to support faculty projects, workshops and seminars, practice with 
feedback (microteaching and minicourses), feedback from ratings by
students, and concept-based training (protocols). The evaluative
research is evaluated and charted according to the following
categories: author/date; purpose; components of design (including
description of participants, duration, and instrumentation); stated
results; threats to validity; strengths; weaknesses; and confidence
rating. In addition, each study is coded according to Campbell and
Stanley's notation system, and specific threats to each type of
validity are listed. Approximately 115 references are appended. 
Preface



This paper attempts to review and synthesize research on interventions 
designed to improve college teaching. Our review is more successful than 
our synthesis, since this relatively small body of research shrinks even 
further after critical analysis is applied. 

Nevertheless, some implications for practice emerge. We hope that
our critique and particularly our use of "confidence ratings" for studies
will assist investigators to produce better designed studies. We hope that 
the next generation of teaching improvement efforts will be informed by 
these findings and evaluated more effectively than many of the studies we 
describe. 

This paper is a working document in two senses. First, we ask readers 
to suggest to us pertinent studies which we may have missed. Second, we 
solicit comments on our interpretations and findings which will improve
subsequent discussion of these issues. 



J. L. 
R. J. M. 
IMPROVING COLLEGE TEACHING: A CRITICAL REVIEW OF RESEARCH



For more than a decade, the movement for faculty development and 
instructional improvement has been generating projects and programs, research 
reports, conferences, and professional meetings. Agencies on many campuses 
support activities which promise to benefit faculty and in turn to enrich 
the education of students. This paper describes attempts to assist faculty
to improve their teaching and critically reviews research evaluating the 
impact of such efforts. 

We have two purposes for this review. First, we wish to assess the 
methodological soundness of these studies and to make suggestions for their 
improvement. Second, we wish to derive implications for practice, i.e., 
what guidance does this research provide for those who plan and administer 
instructional improvement programs? 



Interventions with Faculty 



Interventions to improve instruction take a variety of forms and have 
a variety of purposes. Their users seek to modify institutional climate, to 
restructure the curriculum, to clarify attitudes about teaching and learn- 
ing, to increase knowledge of alternative instructional strategies, to 
introduce technologically sophisticated teaching techniques, to increase 
the clarity of lectures, to improve the quality of examinations, and so on. 
Because faculty members are the agents of instruction, each of these acti- 
vities ultimately requires that teachers change what they do. Our concern 
is with interventions designed to promote such faculty change. 

In this paper, we examine studies which evaluate programs to assist 
faculty as teachers to change their attitudes, roles, or activities. We 
are not concerned here with evaluations of particular instructional tech- 
niques, unless there is also an attempt to change faculty behavior and to 
monitor the success of that attempt. For example, we are not concerned 
with the large literature on the effects of the Personalized System of 
Instruction (for a comprehensive review see Kulik, Kulik, and Cohen, 1979), 
but we are concerned with evaluations of attempts to assist professors to 
become more proficient users of that approach. 

Much of the research we have consulted merely documents program acti- 
vities and assesses participants' satisfaction; but some of it assesses 
the relative impact of approaches and is thus potentially more useful in 
program design. While we draw upon descriptive research for illustrative 
purposes, the studies to which we give critical attention are those which 
individually or in combination can inform the choice of alternative inter- 
ventions for teaching improvement. Whether studies are experimental or 
quasi-experimental in design and whether they use qualitative or quantita- 
tive methods is less important than that they be systematically executed 
and completely reported. 



Impact of these interventions may be assessed using data of 
types supplied both by students and professors. We have identified five
types of evaluation data and the likely data sources for each. The
categories begin with the participating professor's opinions about the
activity and extend to changes in what their students learn.

a. Teacher attitude, assessed by self-report

b. Teacher knowledge, inferred from test or by observer

c. Teacher skill, recorded by observer or reported by student

d. Student attitude, self-reported

e. Student skill, inferred from test or recorded by observer

The most powerful evidence for an intervention is its impact upon students 
(categories d and e), and the weakest evidence consists of the self-reported
opinions of participating faculty members. Yet much of the research we
reviewed fails to go beyond data collected on the spot from participants
(categories a and b). Although we cite in our discussion some descriptive
research with data in categories a and b, most studies in the appendix are 
those with data of types c, d, and e. In short, in order to be summarized 
in Appendix A, a study includes data other than opinions and attitudes of 
participants gathered during the intervention itself. 

In the sections below, we first describe our procedures for the liter- 
ature search and the criteria for our critical review. Several evaluations 
at the institutional or interinstitutional level are then discussed, studies 
which are primarily descriptive. Next, more systematic research on five 
types of interventions is analyzed in some detail. These types are the 
following: grants to support faculty projects, workshops and seminars, 
practice with feedback (microteaching and minicourses), feedback from
ratings by students, and concept-based training (protocols). The final
section of the paper presents some implications for researchers and for
practitioners. 



Procedures for the Literature Search 

A systematic search was carried out for relevant instructional improve- 
ment research with faculty in postsecondary education. Procedures developed 
with precollege teachers, such as microteaching, are also discussed when 
they hold promise for higher education. The search was conducted through
abstract indices, texts, and bibliographies, as well as major educational
and psychological journals. Program officers at public and private funding 
agencies were contacted. Pertinent conference papers were also reviewed. 
In all, more than 100 studies were evaluated for inclusion in this review. 
The papers finally selected for critical attention are summarized in Appen- 
dix A. 

Secondary sources, including review articles, were consulted when the 
body of original research on a topic was very large or if the original study 
could not be obtained. Of course, reliance on secondary sources does not 
permit an evaluation of quality, and where such is the case, it is duly noted 
in the discussion. We are confident that the studies included for final 
review are representative of the research from the mid-sixties to the present. 
Evaluating the Quality of Studies

The following categories are used for the summary of studies in Appendix 
A: (a) author/date, (b) purpose, (c) components of design (including design 
code, description of participants, duration, and instrumentation), (d) stated
results, (e) threats to validity, (f) strengths, (g) weaknesses, and (h) con- 
fidence rating. Several of these categories deserve further elaboration. 

Design code. The design of each study is coded according to the notation
system used by Campbell and Stanley (1963). In their system, O denotes
a point in time at which data are collected (observation). An intervention
or treatment is denoted by X. An X in parentheses, (X), represents an
alternate intervention or treatment unrelated to the major experimental
questions under study. Its usual purpose is to control for the time and
attention received by members of the experimental group. If research
participants are randomly assigned to groups, the designation R is used. A
horizontal broken line between groups indicates that they were not randomly
formed. A vertical broken line means that data gathered before the
intervention came from different persons than data gathered after the
intervention.

For example, consider a study in which students made ratings of their
instructors' teaching at midterm and end-of-term. Instructors in the
randomly-formed experimental group received their midterm ratings but other
instructors did not. End-of-term ratings for the two groups were compared
in order to assess the effect of feedback from student ratings.

This example is coded as follows:

R O X O
R O   O

From this design code, the major features of the study are immediately
apparent. It can readily be seen that there are two groups, differing in
that only one received the intervention, X. The R indicates that
participants were randomly assigned to groups. Data are gathered, O, from both
groups before and after the intervention.

Even quite complex designs are easily comprehended by this notation
system.
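
The design code for the feedback example can also be written down as a small data structure. What follows is only an illustrative sketch: the names Group, render, and describe are hypothetical and come from neither Campbell and Stanley nor the studies reviewed, and the sketch covers only randomized or nonrandomized rows of observations (the letter O) and interventions (X), not the broken-line conventions.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Group:
    """One row of a design code (hypothetical helper, not from the source)."""
    randomized: bool     # True if the row begins with R
    sequence: List[str]  # events by column: "O", "X", or "" for nothing


def render(groups: List[Group]) -> str:
    """Print the design in row-per-group form, keeping columns aligned."""
    lines = []
    for g in groups:
        cells = ["R" if g.randomized else " "]
        cells += [event if event else " " for event in g.sequence]
        lines.append(" ".join(cells).rstrip())
    return "\n".join(lines)


def describe(groups: List[Group]) -> Dict[str, object]:
    """Read off the features that the text says are apparent from the code."""
    return {
        "n_groups": len(groups),
        "all_randomized": all(g.randomized for g in groups),
        "has_untreated_group": any("X" not in g.sequence for g in groups),
        "all_pretested": all(g.sequence[0] == "O" for g in groups),
    }


# The midterm-feedback study described in the text.
feedback_study = [
    Group(randomized=True, sequence=["O", "X", "O"]),
    Group(randomized=True, sequence=["O", "", "O"]),
]

print(render(feedback_study))
print(describe(feedback_study))
```

Keeping one column per point in time, with an empty slot where a group receives nothing, is what lets the two rows be compared position by position.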

Threats to validity. Validity refers to the extent to which the
propositions which express conclusions of a study approximate truth. Cook and
Campbell (1979) discuss four types of validity, each of which asks particular 
questions about the components of an investigation. 

1. Are the independent and dependent variables statistically related? 
This question tests the statistical conclusion validity of a study. 

2. Is the demonstrated statistical relationship between independent
and dependent variables a causal relationship? This question, which requires
that we rule out noncausal reasons for the statistical relationship, tests
the internal validity of a study.

3. Is the demonstrated statistical and causal relationship generalizable
to more abstract constructs? This question requires that the operations
used to gather data are adequate representations of the constructs under 
investigation; it tests the construct validity of the study. 

4. Does the relationship among constructs generalize to other persons, 
settings, and times? This question moves outside the operations and the 
logic of the study itself to test its external validity.

For illustration, consider how these types of validity apply to the 
investigation described above in which professors in one group receive
ratings feedback from students at midterm. The investigator's stated
purpose is to determine the impact on teaching effectiveness of information
from students about teaching performance. Statistical conclusion validity
requires, among other things, that measures have adequate reliability and
that statistical tests have adequate power. Internal validity requires
that an observed end-of-term difference in ratings between groups be due
to feedback rather than to some other variable such as a differential
dropout rate in the two groups. Construct validity requires that the
procedures used when professors receive "feedback" and the questions asked of
students regarding "teaching effectiveness" are adequate representations 
of those constructs. External validity requires recognition that conclusions 
may generalize only to persons, places, and times like those of the study 
itself. 

A number of specific threats to each type of validity are listed in 
Appendix B. Using this list, we have reviewed each study in order to deter- 
mine the appropriateness of design, plausible alternative explanations to 
claimed results, and the degree of confidence that may be placed in the 
results. Threats pertinent to this review are mentioned as each study is 
outlined in Appendix A. 

This approach to validity flows from the quantitative tradition of 
social science research, and most of the research we discuss has placed 
itself in that tradition. We argue that the last three types of validity 
are appropriate for analyzing qualitative research. Whether data are quali- 
tative or quantitative, the threats associated with internal validity, con- 
struct validity, and external validity must be confronted by all investigators 
who wish to make causal inferences.

As an example of careful qualitative analysis of a teaching improvement 
project we cite the American Sociological Association's Project on Teaching 
Undergraduate Sociology. Even though that project's evaluation dealt
primarily with national task groups rather than with interventions at the local
level, it is notable for its methodological stance. Project evaluators were
concerned with what they discern as problems imposed by the
objective/quantitative tradition. They argue convincingly that, if it is to be useful,
an evaluation should violate at least four rules of this tradition: the
rule of objectivity, the rule of measurable outcomes, the rule of nonreacti- 
vity, and the rule of the scientific report. As participants and as obser- 
vers these evaluators compiled field notes as a basis for portraying and 
analyzing events and for attempting to explain why events occurred as they 
did. An evaluation from the quantitative tradition, they expect, would 
have proved impossible in light of the project's very broadly stated purposes
or would have resulted in data of little importance. 

Qualitative methodology does not exempt investigators from an obligation
to rule out threats to validity. The issue is not lost on the evaluators of
the American Sociological Association project. 

It is regrettable that, without experimental evidence, it is not 
possible to attribute causation to the program as the agent of 
change. Since most authorities on program evaluation agree that
experimentation is difficult, if not impossible, under program con- 
ditions, little is lost by abandoning the effort. 

Without the support of experimental logic, our efforts to attribute 
causation must rest on plausible explanations which our data fail 
to contradict and appear to support. If we can rule out alternative 
explanations, so much the better. However, the evaluation cannot
provide conclusive evidence that the world or any part of it is
different as a result of the program. (Deutscher and Gold, 1979,
p. 135) 

We are not as willing to advocate the exclusive use of qualitative 
methodology in program evaluation as Deutscher and Gold seem to be. They
propose that because experimentation is difficult with respect to some pro- 
gram evaluation, little is lost by abandoning such efforts and using quali- 
tative methodology only. We believe that the approaches jointly contribute 
toward ruling out alternate explanations and thus allow us more closely to 
approach causal attributions. The statistical conclusion validity that 
quantitative methodology can provide is important, even if it is available 
for only a few of the many experimental questions of a program evaluation. 
Along with qualitative information, it can provide a more complete picture 
of cause and effect. Campbell (1974) discusses the qualitative-quantitative 
methodological conflict and elaborates the relationship between the two: 

...I have sought to remind my quantitative colleagues that in the
successful laboratory sciences, quantification both builds upon and
is cross-validated by, the scientist's pervasive qualitative
knowledge. The conditions of mass-produced quantitative social science
in program evaluation are such that much of this qualitative base
is apt to be lost. If we are to be truly scientific, we must
reestablish this qualitative grounding of the quantitative in action
research. (Campbell, 1974, p. 30)

Strengths and weaknesses. In evaluating the quality of each study,
major strengths and weaknesses have been delineated. Many of them are
directly related to the validity threats listed for a particular study.
For example, "low statistical power" may have been noted as a statistical
conclusion validity threat because a study used a small sample which may
have contributed to nonsignificant results. Hence, "small N" would be
listed under the weakness category. Strengths of a study might be the use 
of randomization or a thorough discussion of its limitations. Only the 
most pertinent strengths and weaknesses are noted in the table; indeed, for 
some studies, none have been specified. 
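
The link between a small N and low statistical power can be made concrete with a rough calculation. The sketch below is an illustration only, not a procedure used in the studies reviewed: it assumes a two-group comparison of means, a two-sided test at the .05 level, a normal approximation, and a standardized effect size supplied by the reader. The function name approx_power is hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def approx_power(effect_size: float, n_per_group: int, alpha: float = 0.05) -> float:
    """Approximate power for comparing two group means (two-sided test).

    Normal approximation; the contribution of the far tail is ignored.
    effect_size is the assumed standardized mean difference.
    """
    z = NormalDist()  # standard normal distribution
    critical = z.inv_cdf(1 - alpha / 2)
    noncentrality = effect_size * sqrt(n_per_group / 2)
    return z.cdf(noncentrality - critical)


# Power for an assumed medium effect (0.5) at several group sizes.
for n in (10, 30, 64):
    print(n, round(approx_power(0.5, n), 2))
```

Under these assumptions, a study with about ten instructors per group would detect a medium-sized effect only a minority of the time, so a nonsignificant result from such a study says little about whether the intervention worked.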



Confidence rating. A rating of high, fair, or low has been assigned
each study to suggest how much confidence should be placed in its results.
It is difficult to set criteria by which all studies can be evaluated. Some
factors are more important than others, depending in large part on the
specific circumstances of each study. Thus, the ratings are tentative, meant
only to suggest the general level of quality of research on a particular
topic.

With respect to design, randomized studies using two or more groups
have been regarded with greater confidence than studies using one group in
a pretest-posttest design. Limited generalizability of findings is discussed
as an external validity threat (e.g., selection by treatment bias), but
generally we give more weight to internal than external validity. In
assigning the final rating, however, all threats to validity have been considered.

Confidence in the findings of a study (our confidence rating) should be 
differentiated from a judgment of the study's ultimate importance. A high
quality study may deal with a problem of little consequence. Likewise, a
flawed study may merit attention because it is one of the very few attempts
to deal with a problem of significance. 
Interinstitutional Projects and Campus Agencies

Our literature search led us to reports of instructional improvement 
projects that involve groups of institutions or that evaluate the full range 
of activities of a campus agency. In most cases these reports consist of 
little more than program descriptions, sometimes bolstered by comments from
participants. To illustrate evaluations at this level, we present three
examples. One project brought together several institutions for a
cooperative venture in faculty development (PIRIT). The second developed a
publication to convey information about teaching improvements (Change Magazine
Reports on Teaching). The third assessed a campus-wide faculty development
program on a particular campus (Memphis State University).

The Project on Institutional Renewal through the Improvement of Teaching 
(PIRIT) spent three years fostering collaborative activities which reached 
sixteen colleges and universities. On these campuses, teaching improvement 
programs of varied forms were begun. In some cases, programs became embodied
in a center. In other cases, existing instructional activities were redesigned
to provide new roles and experiences for students and faculty. An issue of
New Directions in Higher Education is devoted to a description of the project
and includes a report of its evaluation (Gaff and Morstain, 1978).

Evaluation was based on a questionnaire distributed at the close of the
project to all faculty at fourteen of the PIRIT schools. Case studies by
team members from participating institutions have also been prepared. Those
who returned completed questionnaires were judged to be representative of
all faculty in age, field, and rank. Respondents who participated in project
activities (479) were compared with those who had not (442) and were judged
to be similar in age, field, rank, and profile of interests and activities,
including self-assessed teaching effectiveness. This implies that faculty
reached by the project were representative of faculty in general. Since the
groups had not differed on these items when the survey was given at the start 
of the project, it also suggests that project participation had no impact on 
the particular characteristics measured by these items. 

When asked specific questions about benefits derived from the project, 
faculty gave very positive responses to such items as "contact with
interesting people in other parts of the institution," "increased motivation
or stimulation for teaching excellence," and "personal growth or renewal."
Lower but still positive benefit was indicated for "better relationships
with colleagues," "skill in using new instructional techniques," and
"better relationships with students." Rating overall benefit, 33 percent said
that they would recommend project activities to a friend or colleague, and
61 percent indicated that they were using new techniques or approaches as
a result of their participation. Most regarded these changes as important.
In general, those who reported greatest involvement in the project also
reported greatest benefit. Most nonparticipants, too, were knowledgeable
about the project and positively disposed toward it. 

We must be cautious about the self-report data employed in this
evaluation, but the project does seem to have generated considerable satisfaction
and knowledge among participating faculty. Impacts on faculty skills and
on students are unknown, and we can make no inferences about the
differential impact of strategies or about their cost effectiveness. Therefore, the
findings are not sufficient for confidently deriving principles which would 
be useful in designing future teaching improvement projects for faculty. 

The National Teaching Project of Change Magazine was a three-year
effort which produced six magazine-sized reports as its major products. 
Each report dealt with three disciplines, profiling up to 30 professors 
and describing their teaching practices. The project is relevant to this 
review because its declared purpose was to make an impact upon college 
teaching through mass distribution of a publication "which celebrates
successful teaching improvements."

An evaluation of four of the reports appeared in the final publication
in that series (Francis, 1978). The evaluation employed a variety of methods
("heuristic evaluation") and assessed the impact of the reports on a variety
of audiences including the magazine itself, the disciplinary associations
who had selected teachers to be written about, the professors whose work
was featured, and readers of the reports. Magazine staff members were
pleased by the response to the series (about 50,000 copies of each of these
first four reports were distributed), but they were disappointed that this
response did not also increase sales of regular subscriptions. Little
effect, at least of significant duration, could be documented for
disciplinary associations. Case studies of professors who were profiled revealed
some positive and some not-so-positive effects.

Regarding the larger audience, a questionnaire survey of readers of 
the reports revealed general satisfaction. Seventy-six percent said that 
they found ideas about teaching in the reports, 25 percent planned to incor- 
porate those ideas, and 16 percent said that they were actually using the 
ideas. Twenty-eight percent indicated their own teaching had improved as 
a result of the reports and, of those, about three out of four were able to 
describe the improvement. 

While a project of this nature and scope is unprecedented, only an 
equivocal judgment of its impact can be made from these data. It is
particularly unfortunate that since there was only one "treatment," the evaluation
can ask only one question, namely, to what extent did this strategy work? 
If the project had systematically varied media and dissemination techniques, 
their relative impact and cost could have been assessed. We could then ask 
which strategy was useful with whom for what purposes and at what cost, and
use the findings for subsequent decision making and research. 

On individual campuses, staff or committees charged with teaching 
improvement are typically expected to report on their activities.
According to recent surveys, these reports are likely to include little evaluative
data. McMillan (1975) found 16 of the 35 faculty development agencies which
he surveyed had attempted evaluation but only four of them went beyond
faculty reactions. According to Centra's (1978) survey of 756 institutions,
fewer than one-fifth had attempted evaluation, most using unsophisticated 
designs. A survey of institutions in Ohio (Brown and Inglis, 1978) documented 
evaluation at 14 percent of the four-year colleges and universities and at 
just over half of the two-year institutions. That survey also revealed that 
institutions with teams participating in a series of statewide conferences
on instructional development were no more likely than nonparticipants to 
conduct evaluations of their teaching improvement efforts. 

Most of the campus-level evaluations we have seen are limited to user
reports of satisfaction and dissatisfaction. Despite these limitations, 
such surveys can provide some useful information. For example, Mayo's (1979) 
survey listed each of the objectives of the center at Memphis State Univer- 
sity. He asked users and nonusers how important each objective is, how well 
it is being achieved, what changes if any should be made in it, and how the 
center's performance could be improved regarding this objective. Such res- 
ponses can guide staff who want to know the ostensible preferences of their 
faculty. One provocative finding of this survey was that, in general, both 
users and nonusers rated more highly those center services which are chosen 
freely by faculty, such as production of audiovisual materials, than those 
activities which are initiated by the center and require changes in faculty 
behavior, such as workshops on new teaching techniques. 

From time to time reports are prepared for the purpose of reviewing a 
center's performance and making decisions about its future. A study of such
documents, if they could be obtained, would no doubt be interesting. We
suspect, however, that they might not tell us much about effective program
features generalizable to other institutions, since they serve a purpose
which is fundamentally political. We do not review such documents here.

In conclusion, it appears that few campus-wide and interinstitutional
programs for instructional improvement are evaluated with the care necessary 
to permit conclusions which usefully inform program design. 
Grants to Support Faculty Projects

Many faculty development agencies, particularly those established with 
external funding, award small grants competitively to faculty who propose 
projects for increasing their teaching effectiveness. Grants may purchase
needed material or provide personnel such as proctors, tutors, and clerical
staff. Consultation with instructional development professionals may also
be supported. In more generously funded programs, released time is given
or summer salary is paid. Centra's (1978) survey of faculty development 
practices found that 58 percent of the 756 responding institutions (two- 
and four-year colleges and universities) said they had a program of summer 
grants for projects to improve instruction or courses. 

Because of their visibility, grant programs help to create a positive 
image for professional development centers. The awards also lend credibility 
to instructional ideas originating in the faculty. Programs vary in size 
and in purpose. Davis (1979) points out other distinctions among programs
including whether or not funds are distributed from a centralized source,
whether funds reach many faculty (breadth) or few faculty (depth), whether
the object is institutional change or individual recognition, whether
proposals are evaluated by administrators or by teaching faculty, and what
criteria govern awards. 

Research on grant programs is needed to answer a number of questions. 
What changes in instruction do project grants produce? Do these changes 
persist? What is the impact of the changes on students? How are such
programs best organized with regard to size and duration of grant as well
as characteristics of the project or person funded? How do project benefits
compare with project costs? Unfortunately, most grant programs are documented
at a descriptive level. Reports for internal circulation or for funding
agencies may tell only what awards were made and for what purpose. In
addition, the recipient may prepare an account of how the grant was used. Since
givers and receivers of awards are obviously self-interested, data should
also be gathered from objective observers, from comparison groups of faculty,
and from students intended to benefit from the program.

Most reported evaluations of granting programs find that participants
are satisfied. For example, 70 percent of Centra's respondents whose insti- 
tutions had summer grant programs said they felt the program was effective 
or very effective. 

At the national level, one large scale grant program seeking to affect 
college instruction is the Institutional Grant Program of the National
Endowment for the Humanities. From 1971 to 1977, [amount illegible] million
dollars were awarded for developmental grants and approximately $8 million for
pilot grants. Impact on teaching and learning was one of the evaluation
criteria; "the need to break the molds of custom in teaching and learning"
was identified as most important among the goals of the program. Results
of evaluation of a sample of grants according to that criterion were not
encouraging: 50 percent of the developmental grants and 32 percent of the
pilot grants were judged successful in this regard, 32 percent of the
developmental grants and 38 percent of the pilot grants were judged partially
successful, and 18 percent of the developmental grants and 30 percent of the
pilot grants were judged unsuccessful (Curtis, 1978). Even these estimates 
may be inflated, since judgments were made by site visitors who had no data 
from students. 

A granting program was part of the American Sociological Association's 
Project on Teaching Undergraduate Sociology. Members were invited to sub- 
mit proposals for creating nine experimental programs on any aspect of 
undergraduate education in sociology. During the first year, two proposals 
were recommended for funding by the Association to the Fund for the Improve-
ment of Postsecondary Education (FIPSE), which was supporting the Association's
project. Only one proposal was funded by FIPSE. In the second year, five
proposals were recommended by the project and none was approved by FIPSE. 
The proposal solicitation program was discontinued in the third year since 
its results did not justify the required resources. 

In a preliminary draft of their evaluation of this program, Deutscher
and Beattie (1978) offer several reasons for its apparent failure. First, 
the project had devised procedures for selecting proposals for funding, but 
FIPSE insisted that allocations be governed by its routine review procedures. 
Second, in order for FIPSE to screen proposals, the interval during which 
the Association could solicit and review them was quite short. Third, asso- 
ciation members who submitted proposals were not as skilled in preparing 
proposals as expected. Fourth, the Association's review group was inexper-
ienced in the review task. Fifth, the Association had insufficient resources
to assist those submitting proposals during the revision and resubmission
process.

Given the innovative nature of this project, it is understandable that 
many of these problems were not anticipated. Deutscher and Beattie caution 
against interpreting this enterprise as a failure, although it assuredly 
did not meet its original objective. They point out that the project dir- 
ector's decision to discontinue the proposal competition after two years 
made resources available to other project activities which had a greater 
likelihood of success. The decision was, therefore, an appropriate adaptive 
response. As a qualitative study, this evaluation is richly suggestive of 
difficulties which may also afflict campus-level granting programs.

At the state level, an instructional minigrant program has been admin- 
istered out of the Office of the Chancellor of the California State Uni- 
versity and Colleges System. Since 1974 between $200,000 and $300,000 have 
been awarded annually. An evaluation reviewing four years of the program
gathered data from grant recipients, deans, department chairs, local campus
coordinators, and local faculty senators (almost 1400 persons). Final proj-
ect reports (N = 560) were also examined.

The resulting report reveals a great deal about the program's public 
relations value. It documents that funded projects were in fact instruc- 
tional in nature and concludes that the overall response has been favorable. 
It also concludes that "local campuses have not developed a formally struc- 
tured and reportable mechanism for evaluating the projects they funded" 
(Bogdanoff, 1979, p. 3). Since the study relies only on "collective pro- 
fessional opinion," no data are available on how the grant proposals were 
implemented or what effects they may have had on students. 

Like other studies reviewed so far, this study is severely limited
because its data come solely from grant recipients and those working inti- 
mately with them. No alternative treatments are evaluated, no attempt is 
made to verify independently how grants were implemented, and impact on 
students is not studied. 

A campus-level program has functioned for several years at Michigan 
State University. Grant-making activities of the Educational Development 
Program were evaluated in a survey of persons who received grants for under- 
graduate classroom projects from 1970 through 1975. Grant recipients were 
found to be representative of all faculty in age, rank, college affiliation,
and self-perception. A factor analysis of questionnaire responses suggested 
that these instructional innovators were of three types: the reward seeker, 
the information seeker, and the dissatisfied maverick. Nearly all recipients 
reported that they were pleased with the results of their work under a grant. 
The innovations developed were reported still to be in use in 81 percent of 
the departments and by 74 percent of the developers. 

All grantees had been asked to submit evaluations of their projects, 
and these reports served as the data base for preliminary assessment of the 
granting program. Of the 98 projects (1970-1974) examined in the study, no 
report had been received for 33, reports containing no evaluation were 
received for 14, and reports including evaluation were received for the 
remaining 51. Most evaluations were impressionistic and only 10 "could be 
considered of high quality" (Davis, Abedor, & Witt, 1976, pp. 97-98). 
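
The report counts above are easier to weigh as proportions. The following sketch simply restates the figures given in the text as percentages; the labels and the helper function are our own, and nothing beyond the quoted counts is assumed.

```python
# Project report counts for 1970-1974, as given in the text
# (Davis, Abedor, & Witt, 1976).
reports = {
    "no report": 33,
    "report without evaluation": 14,
    "report with evaluation": 51,
}
high_quality = 10  # evaluations that "could be considered of high quality"

total_projects = sum(reports.values())


def percent(part, whole):
    """Percentage of `whole` represented by `part`, rounded to one decimal."""
    return round(100 * part / whole, 1)


shares = {label: percent(count, total_projects) for label, count in reports.items()}
high_quality_of_all = percent(high_quality, total_projects)
high_quality_of_evaluated = percent(high_quality, reports["report with evaluation"])

for label, share in shares.items():
    print(f"{label}: {share}%")
print(f"high quality, of all projects: {high_quality_of_all}%")
print(f"high quality, of evaluated reports: {high_quality_of_evaluated}%")
```

On these figures, about half of the projects produced any evaluation at all, and roughly one project in ten produced an evaluation judged to be of high quality.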

This study, therefore, identifies two kinds of information useful in 
evaluating such programs, namely innovator characteristics and grant reci- 
pients' reports. 

At the University of Michigan, Kozma (1978) assessed a program to 
increase the use of instructional technology by faculty. The project awarded 
several faculty fellowships for released time, seminars, and technical assis-
tance. Incidence of use of teaching innovations based on instructional tech-
nology was assessed before and after the fellowship period. Data were col-
lected from several groups: faculty fellows (N = 10), chairpersons of university
departments (N = 13), unsuccessful applicants for fellowships (N = 8), holders
of instructional development grants (a support program of smaller grants,
N = 25), and a randomly selected faculty comparison group (N = 137). Since a
given amount of funds can support considerably more instructional develop- 
ment grants than fellowships, comparing these groups provides a test of 
breadth versus depth in a granting program. Kozma found that both fellows
and grant holders reported significantly increased use of innovations at the
second survey compared with the first. Unsuccessful fellowship applicants also
increased their use of innovations (but not significantly), and chairpersons
and general faculty did not. These data are limited to self-reports which
might be biased to please the investigator, but they do suggest that both
programs had positive effects. There may also have been a predisposition
among unsuccessful
applicants to the program toward adopting innovations. In addition, the 
study presents evidence for diffusion of knowledge about these innovations. 
Fellows kept records of contacts with colleagues during which their projects 
were discussed. When later contacted independently by staff, these colleagues
verified the conversations but indicated that they themselves had not 
adopted the innovations. 



Research like this last study begins to document the impact of grant
programs with varying features. It appears from this study that less expen-
sive programs can be effective, but, as Kozma says, that finding may be a
product of interaction with the characteristics of these participants. Since
fellows were more innovative at the time of the first survey than were instruc-
tional development grant recipients, the more intensive (and expensive) pro-
gram may have been appropriate for them, while a less demanding program was
appropriate for professors just beginning to consider innovations in their 
teaching. Future studies, building on this one, should develop more reliable 
and sophisticated measures to assess instructional impacts. 

In conclusion, we can say little about variables which would permit 
more intelligent design of granting programs. Such programs have attractive 
face validity, since persons completing a grant-supported project are likely
to have gained new knowledge and skills. Nevertheless, impact on students 
is uncertain and remains to be studied in relation to specific features of 
particular programs. 

Workshops and Seminars

Perhaps the most frequent but least carefully evaluated instructional 
improvement activities are workshops and seminars. These are occasions when 
faculty and prospective faculty gather to discuss or otherwise explore some 
topic related to teaching and learning. The gathering may be an informal con-
versation over brown bag lunches, a presentation by an off-campus consultant,
a highly structured weeklong summer workshop, or any number of variations. It
may or may not involve students and may or may not carry academic credit or
financial remuneration. Attendance is sometimes voluntary and sometimes coerced.

The goals of these gatherings also vary. Purposes include helping fac- 
ulty to get acquainted with one another, stimulating examination of attitudes 
about teaching and learning, generating interpersonal support for teaching 
improvement activities, increasing knowledge about research on teaching, 
developing a shared vocabulary for talking about teaching, mastering specific 
skills for course development or for communicating subject matter or for 
assessing student learning, and so on. 

A number of courses to train graduate teaching assistants have been sys- 
tematically evaluated. Activities for experienced faculty, on the other hand, 
are typically evaluated rather informally by questionnaires distributed at the 
close of the event or soon thereafter. Participants are likely to be asked 
how they felt about the activity and what they learned from it. These com- 
ments, at least as described in reports and published articles, are usually 
positive, but permit no conclusions about impacts which persist beyond the 
event itself.

In the discussion below, we deal with two types of workshops and seminars. 
The first type aims at changes in attitude and affect and the second type is 
oriented toward changes in skill. 

Changes in Attitude 

Research findings in social psychology suggest that exposure to diverse 
points of view facilitates attitude change. The likelihood that such changes 
will persist is greatest when persons are confronted by opposing views, that 
is, when they grapple with dilemmas which they perceive as relevant, which
generate emotional involvement, and for which possible solutions can be iden- 
tified (Cook and Flay, 1978). We suspect that discussions meeting these con- 
ditions are what many faculty have in mind when they refer to a "good" dis- 
cussion. 

If one is to understand opposing views, however, mere exposure to them 
may not be sufficient. Tjosvold and Johnson (1977) had college students 
discuss their views about a moral dilemma. Before discussing the dilemma 
with another student (who was actually a confederate of the experimenter) 
some were led to believe the other's views were the same as their own (no
controversy condition) and some were led to believe that the other's views 
disagreed with their own (controversy condition). Those not exposed to 
controversy were more confident that they understood the other person's
view than those who were exposed to controversy, but a direct measure of
understanding revealed greater knowledge for those in the controversy than 
the no-controversy condition. 

These findings imply that a faculty group which merely discusses oppos- 
ing views may report inaccurately high levels of satisfaction, while those 
dealing with incompatible views actually represented in the group will learn 
more, at least about those views. Educational research in general supports 
the value of controversy in classrooms for promoting curiosity, problem 
solving, and intellectual growth (Johnson & Johnson, 1979). 

A number of devices are useful for stimulating controversy in groups. 
Surveys may be used to introduce findings which violate group members' expec-
tations. Case studies may pose dilemmas. Role playing may promote identi-
fication with positions other than one's own. Discussions stimulated by
videotapes of college classes may also satisfy these conditions. One series 
of such tapes, produced at Northwestern University, is used at workshops 
which aim at attitude examination and change. College Classroom Vignettes 
are discussion stimulus videotapes showing classroom incidents and interviews 
with professors and students. Because the taped segments are brief and are 
presented out of context, they elicit a variety of reactions from viewers. 
Not all of these reactions are compatible, and subsequent discussions must 
confront opposing views. Controversy is heightened if a later segment of 
the tape provides information, such as student comments, which contradicts 
a viewer's reaction to the first part of the tape. Or one may have to recon- 
cile a negative reaction to an event on the tape with the fact that such 
events are typical of one's own teaching. 

In some vignette discussions alternative models of "good" teaching 
emerge. For example, Brock (1976) notes that viewers may contend a parti-
cular action of a taped teacher is "bad" because (presumably) everyone knows
that it is bad. Others, however, may judge the action according to its
effects. For example, a teacher's interruption of a student is seen as bad
only if it stifles subsequent class discussion. Thus the viewers must deal
with contradictions between what might be called the "consensus model" and 
the "effects model" of good teaching. 

Participants in vignette sessions report that the discussions expose 
them to a variety of views. Content analysis of vignette discussions documents
that controversies do occur. That is, discussion moves from general concerns
and dependence on the leader to free expression of disagreements and relative
independence from the leader (Menges, 1979).

There is no systematic evidence to show that vignette discussions have 
an impact beyond the sessions themselves, although anecdotes indicate that 
some faculty are subsequently motivated to try new teaching methods or to 
have their own classes videotaped. One activity which could build upon these 
video-stimulated discussions is the small, peer-led group. Blumenthal (1978)
describes one such group in which members shared tapes of their own classes. 
Group activity may become a source of significant interpersonal support 
for continued attention to teaching improvement. Blumenthal points out sim-
ilarities between such sessions and encounter or consciousness-raising groups.
Support groups of this kind are potentially powerful vehicles for attitude 
and affective change. Still another approach involves teams of faculty mem-
bers who visit one another's classrooms and share their reactions. Sweeney and
Grasha (1979) describe a large-scale program of this kind which was positively 
evaluated by participants. 

Faculty workshops may also aim at more complex affective characteristics.
Goldman (1978) evaluated the College Center of the Finger Lakes (CCFL) Fac- 
ulty Development Program to determine its impact on personal development, 
which was one stated objective of the program. Participants' level of self-
actualization represented personal development and was measured by Shostrom's
Personal Orientation Inventory (POI). A pretest-posttest design with matched
controls was used. The six-day CCFL Basic Instructional Workshop consisted 
of instruction in diagnosis of teaching and learning styles, instructional 
methods and techniques, selection of teaching strategies, new instructional 
media and resources, and personal values and life plans as they affect 
instruction. Activities included discussion, role playing, skills training, 
and a series of micro-colleges. Included in the study were 12 college pro-
fessors who participated in the workshop and 10 professors matched on age
and academic division who were not involved in the workshop. Significant
increases for six of 12 subscales (inner directedness, self-actualizing
values, existentiality, feeling reactivity, acceptance of aggression and 
capacity for intimate contact) were noted for the experimental group while 
no significant changes occurred for the control group. Goldman points out 
that there were no significant differences between groups at the pretest. 
He concludes that his study supports the notion that such faculty workshops 
promote participant self-actualization. 

Goldman's study has been assigned a low confidence rating because the 
nonequivalent control group design allows for a number of plausible explana- 
tions for the findings. For example, local history may have influenced the 
experimental and control groups differently. Resentful demoralization may 
have occurred for those who were excluded from the workshop. Small sample 
size may have contributed to nonsignificant results. Furthermore, although
pretest-posttest differences are significant on some of the subscales for 
the experimental group, the differences in absolute values may not be of 
practical significance. Also, only one instrument was used to assess self- 
actualization (mono-method bias). Two strengths of the study are, first, 
that it is based on a clear foundation, the model of faculty development set 
forth by Bergquist and Phillips (1975b). Second, it attempts to assess the 
complex construct of self-actualization rather than limiting itself to 
participant reactions. 

Changes in Skill

Seminars on college teaching at a number of colleges and universities
provide training for teaching assistants, prospective college instructors,
and inservice faculty. For example, at Northwestern University the Seminar
on College Teaching examines such topics as selecting course objectives,
task analysis, presentation techniques, discussion skills, and course eval-
uation. Each student prepares a "unit of instruction" for classroom use.
As well as affecting one's knowledge of these topics, such seminars should
enhance teaching-related skills. Unfortunately, there are many
more descriptions of seminars in college teaching than systematic studies
of their impact. The following discussion briefly samples from descriptive
reports and case studies. Then, systematic studies are reviewed in more detail. 

The case studies and reports detail either university-wide or departmental 
courses for college teachers or teaching assistants. The courses vary in 
length from one term to year-long programs. Some seminars focus on a parti- 
cular theme; others cover a variety of issues. For example, Finger (1969) 
described a two-term graduate seminar entitled "Professional Problems,"
offered to psychology graduate students. The seminar covered such topics 
as employment settings, the history of academic and professional psychology, 
the history of higher education, curriculum alternatives, instructional tech- 
niques, and student rights and responsibilities. Some practical teaching 
experience is arranged for each seminar member. Finger reports that both 
students and faculty have derived benefits from this seminar experience. 

A year-long teaching fellow training program was described by Kapfer
and Della-Piana (1974). This program includes an orientation workshop fol- 
lowed by several options for developing teaching techniques in the areas of 
proficiency testing, personalized instruction, or student testing techniques. 

Rose (1972) reports a campus-wide program at the University of California 
at Los Angeles for increasing the effectiveness of teaching assistants. Entitled 
"University Level Instruction," the course was taught by Professor W. James 
Popham in the winter quarter of 1969. The overall objective was to help teach-
ing assistants become competent in planning and evaluating instructional sequences.
Two indicators of success of the course are reported. All students scored
90 percent or better on the final examination, and a significant shift in
attitude toward the criterion-referenced approach was found.

A pilot project for teaching assistants at the University of Florida 
(Smith, 1974) assessed course impacts on the participants' classroom teaching
behavior. One objective was to develop the skill of probing, and the material 
for that objective was based on an instructional module used in programs for 
public school teachers. Other topics included new media in higher education 
and the use of a systematic approach to college teaching. Teaching assis-
tants were assigned supervisors who observed their classrooms and videotapes of
their teaching and provided feedback. Eleven of 15 teaching assistants increased
the amount of time they spent in asking questions of students. Those whose 
questioning time declined had initially spent more time questioning and had 
apparently chosen to develop other skills. It was also found that at the 
seminar's end teaching assistants spent less time lecturing and more time 
responding to students' questions. 

The impact of a seminar or workshop on teachers' skills may be inferred 
from researchers' observations of teacher behavior, from student perceptions 
of the teacher, from gains in student learning, or by some combination of 
these. First we mention two studies limited to researcher-observed changes 
in classroom behavior. Then eight studies are described in which student
perceptions were gathered. Finally four studies are reviewed where student 
achievement was measured. Some studies in the two latter groups also include 
data from classroom observers. (Many of these are dissertation studies and,
for some of them, we have had access only to the abstract. If important
information is missing from the abstract, the omission is noted in Appendix
A, but abstracts have usually been sufficient for deriving a confidence 
rating.) 

Impact on classroom behavior. In one dissertation study of teaching
assistants' classroom behavior (Murphy, 1972), new teaching assistants in
chemistry at Ohio State were assigned by a stratified random technique to a
training group or to a no-treatment control group. Training included group
discussions, microteaching sessions, and classroom observations followed by 
conferences with the observers. To evaluate the training, all participants 
were observed once before training and twice after training; audio recordings 
were made of those classes. Classroom events were coded according to the 
categories of Flanders Interaction Analysis and the Question Category System 
for Science. 

Several posttraining differences were revealed by analysis of variance.
Trained teaching assistants were more successful in drawing students into 
discussion. They lectured less, used more praise and encouragement, and 
asked more questions. But there were no differences in the type of question 
asked or in the proportion of correct responses elicited. 

A second dissertation study limited to observations of classroom teach- 
ing skill assessed effects of 10 one-hour seminars for teaching assistants 
in biology at Georgia State University (Rhyne, 1973). Twelve teaching assis- 
tants were observed in the lab for one and one-half hour periods just before 
and just after training. Analyses were made of verbal interaction patterns, 
nonverbal movements, and the types of questions asked. 

After training, teaching assistants spent more time with students, asked 
more convergent and divergent questions (but no more managerial or rhetorical 
questions), and engaged in more indirect talk. Absence of a comparison group
and use of weak statistical techniques prevent us from drawing causal infer- 
ences about the observed changes. The findings are suggestive, nevertheless, 
and the study is notable as the only one we have located using a lab setting. 

Impact on student perceptions. Yaghlian (1972) worked with teaching
fellows in chemistry for his dissertation study. A series of five work-
shops on a variety of topics was attended by eight to 15 persons.
Students of the 15 participating teaching fellows gave higher ratings to 
the course than did students of nonparticipating teaching fellows. Changes 
in attitudes of participants were also discerned. The study had an applied 
emphasis and elements of the program were subsequently adopted by that 
department.

Costin (1968) assessed the impact of seminar participation on student 
ratings of psychology teaching assistants. Entitled "Principles and Methods 
of Teaching Psychology," the seminar has been taught for some years at the 
University of Illinois (Urbana-Champaign). During the course, students are
asked to make a 30-minute presentation which is then critiqued by seminar
participants. A survey of 65 seminar participants indicated that the most
important course topics in their view related to practical daily work of a 
college teacher and to specific aspects of the following areas: (a) develop- 
ing course objectives, (b) selecting and organizing course content, (c) 
planning and handling teaching-learning situations, and (d) evaluating the 
attainment of course objectives. 

Two substudies were carried out, comparing teaching assistants who had
participated in the seminar with those who had not yet enrolled on the fol-
lowing dimensions: skill, structure, feedback, group interaction, and
student-teacher rapport. In one analysis, teaching assistants were rated
by their students. Participants' mean ratings were significantly higher
on rapport. The second analysis was limited to teaching assistants with
at least two terms' experience. For that group, adjusted mean ratings after
one semester revealed no significant differences between seminar partici-
pants and nonparticipants; after the second semester, differences favored
participants on group interaction and feedback. Costin concludes that the 
seminar was reasonably successful in helping teaching assistants to develop 
more positive interpersonal relationships in the classroom. 

At Florida State University, a program for teaching assistants was 
developed in the geology department and subsequently used in the chemistry 
department. Hockett (1972) found that, after participation, teaching assis- 
tants showed less teacher control, more individual interaction, and more 
high-level questions. Attitudes of the students of these teaching assistants 
also are reported to have changed in a positive direction. This is entitled
a "pilot study" and requires cautious interpretation since the sample is non-
random and apparently there was no control group.

Teaching assistants in business administration at Arizona State Univer- 
sity participated in Haber's (1973) dissertation study. Twelve teaching 
assistants randomly selected from 19 in the department were in turn assigned
at random to three groups. One group received instruction in effective 
questioning techniques, using the Flanders system, and also received feedback 
on their classroom performance. A second group received feedback and no 
instruction. A control group received neither feedback nor instruction. At pretest,
teaching assistants were generally found to be "direct teachers" who favored
a controlling role which limits student participation. After training, there
were no differences between groups in teachers' classroom behavior or in
their ratings by students. Teachers' attitudes, measured by the Minnesota
Teacher Attitude Inventory, were significantly related to their observed 
behavior. 

In another study, teaching assistants in psychology, both graduates (N = 4)
and undergraduates (N = 15), taught weekly seminars in their areas of interest
as a supplement to faculty lectures (Carroll, 1977). In a posttest-only
control group design, teaching assistants were randomly assigned to an exper-
imental seminar (N = 10) or a control seminar (N = 9). All teaching assistants
were required to attend but were unaware of their group membership and of
the variables being studied. The experimental seminar included scheduled
readings, individual conferences, at least one individual critique of a
videotape, an unstructured group meeting, and five formal workshop sessions.
The control seminar was less structured, included less input by the instruc-
tors, and provided an opportunity to view one videotape alone without a
critique.

Experimental and control teaching assistants did not differ on sex, 
grade level, verbal aptitude, cumulative grade point average, major, or 
primary reason for taking the course. Interaction analysis of tapes obtained 
near the end of the term showed that classrooms of the experimental teaching 
assistants were more student centered than those of control teaching assis-
tants (p < .06), although experimental classrooms did not show higher levels
of student talk. As predicted, the experimental group received higher stu-
dent ratings than controls on the use of objectives (p < .07) and on general
effectiveness of instruction (p < .10). Use of indirect teaching skills was
correlated with student ratings (p < .02).

Less powerful effects of teaching assistant training were found in
Dalgaard's (1976) dissertation study. Her dependent variables included
ratings of the teaching assistants (a) by their students, (b) by experts, and
(c) on a self-evaluation form. Twenty-two inexperienced and untrained teaching
assistants in economics, business administration, and geography departments at
the University of Illinois (Urbana-Champaign) were randomly assigned to a 
training group or to a no treatment control (stratified by department). 
Training included six two-hour seminars on topics related to instruction. 
Trainees also individually viewed videotapes of their classes with a trained 
supervisor. 

Experts rated the teaching performance of trained teaching assistants 
higher than that of untrained teaching assistants, but no impact was found
on student ratings or self-evaluations. Participants recommended that the
program be required for new teaching assistants, and the dissertation includes
materials used in that training. 

We have located two studies in which faculty members participated. In
the first, courses at the University of North Dakota College of Nursing were 
rated both fall term and winter term (Kingston and Lacefield, 1979). During 
winter term, faculty participated in the TIPS workshops developed at the Uni-
versity of Kentucky. Sessions dealt with organizational skills, interpersonal
communication skills, teacher behavior, and evaluation skills. A microteach-
ing component was also included. Over half of the ratings in areas covered by the
workshops increased significantly from fall to winter for the 29 instructors,
despite the brief interval between training and winter ratings. No control
data are reported from previous years or from nonparticipants (since all
faculty participated), and so it is not possible to estimate the chance that
changes would have occurred in the absence of workshop participation.

A detailed description of a 10-week workshop for faculty is given by 
Howard (1977). In weekly two-hour sessions, participants developed such 
skills as identifying their own teaching goals, discussing teaching in 
nonjudgmental terms, and consulting with one another about teaching. Mem- 
bers observed one another's classes, and videotapes of their own classes 
were viewed and discussed in the group. Hoyt and Howard (1978) report an 
evaluation of two such eight-member groups at Wichita State University. 
Of 68 faculty who indicated interest in the program, 16 were randomly 
assigned to the experimental groups and 16 to a control group. Students 
in one course taught by each of the 32 participating faculty completed a
course evaluation form at midterm and again at end-of-term. Changes on 12
of the 13 items and on the total score favored teachers in the experimental
groups. ANCOVA found the experimental groups significantly higher on four
of the 13 items and on total score. Because faculty were randomly assigned 
to conditions, the study controls for motivation (within a volunteer group) 
and supports the value of these workshop activities; however, one possible 
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source of bias is that raters, if they noticed the taping and observation 
activities, may have suspected that an experiment was under way. 
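
The logic of adjusting end-of-term ratings for midterm ratings can be 
sketched in a few lines of Python. This is only an illustration under stated 
assumptions: the ratings are invented, the function name adjusted_means is 
hypothetical, and the sketch computes covariance-adjusted group means only. 
It does not reproduce the significance tests or the data reported by Hoyt 
and Howard (1978). 

```python
from statistics import mean


def adjusted_means(groups):
    """Return covariance-adjusted posttest means for each group.

    groups maps a group name to a list of (pretest, posttest) pairs.
    The adjustment uses the pooled within-group regression slope of
    posttest on pretest, as in a one-covariate ANCOVA.
    """
    grand_pre = mean(pre for pairs in groups.values() for pre, _ in pairs)
    sum_xy = 0.0
    sum_xx = 0.0
    for pairs in groups.values():
        pre_mean = mean(pre for pre, _ in pairs)
        post_mean = mean(post for _, post in pairs)
        sum_xy += sum((pre - pre_mean) * (post - post_mean) for pre, post in pairs)
        sum_xx += sum((pre - pre_mean) ** 2 for pre, _ in pairs)
    slope = sum_xy / sum_xx  # pooled within-group slope
    return {
        name: mean(post for _, post in pairs)
        - slope * (mean(pre for pre, _ in pairs) - grand_pre)
        for name, pairs in groups.items()
    }


# Invented (midterm, end-of-term) mean ratings, one pair per instructor.
ratings = {
    "experimental": [(3.0, 3.6), (3.4, 3.9), (3.8, 4.4)],
    "control": [(3.2, 3.2), (3.6, 3.7), (4.0, 4.0)],
}

if __name__ == "__main__":
    for group, value in adjusted_means(ratings).items():
        print(group, round(value, 2))
```

In this invented example the experimental group starts slightly lower at 
midterm, so the adjusted difference between groups is larger than the raw 
end-of-term difference. 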

In summary, all but two of the studies of seminars for teaching assis- 
tants found changed attitudes of participants' students, particularly with 
regard to students' perceptions of teachers' classroom performance. The 
magnitude of the impact is small, and of the three studies with experimental 
designs, two failed (Dalgaard, 1976; Haber, 1973) and one succeeded (Carroll, 
1977) in demonstrating statistically significant impact. The two studies of 
workshops for faculty did not investigate participants' classroom behavior 
but, like the studies with teaching assistants, did demonstrate impact upon 
students' ratings. 

Impact on student learning. Of the four studies which examine impact 
upon student learning, we first describe a training program for seven grad- 
uate assistants in introductory economics which was conducted during the 
second term of their teaching (Lewis & Orvis, 1973). Each was responsible 
for two sections of 25 students; all students also met together for lectures 
by senior faculty. During fall term no training was available. During win- 
ter term, instructors met for weekly seminars and each was videotaped three 
times, following which about two hours were spent in individual review and 
critique of the tape and of the instructors' ratings from the previous term. 
Student achievement, student ratings of instructors, and the instructors' 
classroom behavior were compared between fall (control) and winter (experi- 
mental). 

Stepwise multiple regression indicated that the average student of a 
trained teaching assistant scored significantly higher on a standardized test 
of achievement in economics (p < .05). The following variables were also 
significantly associated with achievement: prior knowledge of economics, 
mental ability and achievement, maturation, sex, and student evaluations of 
instruction. Student evaluations were significantly more positive in winter 
than in fall term. Anticipating criticism of the quasi-experimental design, 
the authors argue that results do not represent a practice effect since such 
fall to winter changes had not occurred the year before. 

Thirteen teaching assistants in rhetoric, participating in Hoffman's 
(1974) dissertation research, were videotaped and completed questionnaires 
and tests at the start of a term. They were divided into two treatment 
groups and a no-treatment control group. Groups one and two reviewed their 
data with an instructional specialist. Group one, in addition, held subse- 
quent meetings with the specialist, who provided further suggestions and 
training. After eight weeks, all teaching assistants were again taped and 
again completed the written measures. Videotapes, student evaluations, and 
tests were also analyzed. Measures of teacher behavior and of student atti- 
tude and achievement favored the continuing treatment group (group one) and, 
less so, group two, compared with controls. These trends, however, did not 
reach statistical significance. 

Training in both interaction analysis and heuristic questioning was 
investigated with teaching assistants in mathematics in Tubb's (1974) dis- 
sertation study. Eight teaching assistants were randomly selected from 21 
who were teaching a calculus course for nonmathematics and nonengineering 
students. Teaching assistants were trained in Flanders Interaction Analysis 
or in Polya's Heuristic Teaching or in both. Although the numbers receiving 
each type of training are not reported in the abstract, each strategy appears 
to have influenced classroom behavior of those who were trained, as shown by 
change scores. Students of trained teachers showed higher achievement and 
problem-solving skill than control students and rated their instructors even 
higher than their "ideal expectations" for teaching ability. 

Eight teaching assistants in the mathematics department at East Carolina 
University were randomly assigned to a group trained in interaction analysis 
or to a group receiving no training in Daniels' (1970) dissertation study. 
Some of the participants were pursuing a degree in mathematics education and 
others were pursuing a degree in mathematics. Flanders categories were 
applied to audiotapes made at several points during the term. Trained teach- 
ing assistants scored higher on four of the nine categories used in the 
analysis, indicating greater indirectness and flexibility. The mathematics 
education group scored higher than the mathematics group on six of the cate- 
gories, regardless of training. Students of the mathematics education teaching 
assistants scored higher than those of mathematics teaching assistants regard- 
less of training group. Thus, both training and degree objective are influen- 
tial in this study. 

In conclusion, the evidence from these courses and seminars spans a 
number of academic fields and suggests that seminar experience can affect 
the achievement of students of trained teachers as well as affect student 
attitudes and teacher classroom behavior. Not all studies find significant 
differences and not all studies avoid important threats to validity, but 
such trends are well worth pursuing. Because they are based primarily on 
graduate teaching assistants, their generalizability is limited. Experienced 
faculty may be unwilling to volunteer and may strongly resist being assigned 
to such activities. Further, teaching experience may interact with program 
activities and thus decrease (or perhaps increase) the impact of training. 


Guidelines for Assessing Impact 

Estimates by participants of their satisfaction and learning are the 
most common data for evaluating the impact of workshops in which faculty 
participate. There are important problems with relying on such estimates. 
To close this chapter, we refer to the literature in continuing medical 
education for illustrations of these problems. 

One study evaluated intensive instruction (12-20 hours) given to prac- 
ticing physicians in recognizing unknown heart sounds (auscultatory skill) 
(McGuire, Hurley, Babbott, & Butterworth, 1964). During instruction, heart 
sounds and their visual representations were simulated; participants prac- 
ticed naming the sounds and received immediate feedback. Anonymous eval- 
uations showed that participants felt they had learned a great deal, and 
assessment of their skills showed that, compared with a control group, they 
made significant gains from pretest to posttest. 

Six months later, a representative subgroup of participants was again 
tested. Two results are noteworthy. First, their mean skill score at six 
months was not significantly different from their mean score at pretest. 
Second, it was expected that even if there was a decrement in skill, the 
course might have produced increased sensitivity to cardiac findings and a 
consequent increase in the frequency and variety with which cardiac infor- 
mation was observed and recorded. A comparison of hospital charts completed 
by these physicians before and after the course revealed no differences in 
the amount or quality of the cardiac information recorded. 

Assuming that skill-oriented teaching improvement workshops are designed 
in some ways parallel to this one, these findings should caution us (a) against 
accepting end-of-course satisfaction as predictive of long-term learning, 
(b) against accepting end-of-course skill gains as indicating long-term skill 
learning (unless there is opportunity for subsequent practice with critical 
evaluation), and (c) against assuming that in the absence of changes in per- 
formance a workshop may, nevertheless, produce changes in a general charac- 
teristic such as sensitivity. 

The relationship between self-rated learning and objectively assessed 
learning was also explored in the evaluation of an educational development 
program at Wayne State University School of Medicine. Fifty-five persons 
participated in two three-hour meetings for each of 12 weeks. The sessions 
covered a variety of topics related to learning and instruction. Partici- 
pants rated their progress on statements expressing the objectives of each 
session. Their ratings were generally high and fairly uniform across objec- 
tives, surprising staff who had noted considerable variability in actual 
accomplishment. Further consideration of staff observations and of the 
participants' ratings suggested several conditions which affect the accuracy 
of participants' estimates: When participants could "engage in free discus- 
sion, when there was a comfortable rapport between teacher and participants, 
when relatively few demands were made on them to demonstrate their skills, 
and when there was little external feedback to them on their performance, 
there were uniformly high achievement ratings. When there were clear tests 
of their knowledge and external feedback, ratings of achievement varied 
between people and between objectives and were generally lower" (Koen, 1976, 
p. 855). 

These illustrations imply several guidelines for workshop assessment, 
guidelines which are seldom followed in the research on faculty workshops. 
Both immediate and delayed tests of ability should be made, but it should 
be recognized that without opportunity for continuing practice with feedback, 
the post-course level of skill mastery is not likely to be maintained. Parti- 
cipant self-assessments, if they are to be accurate, should refer to specific 
behaviors, those behaviors should have been assessed during instruction, and 
participants should have had opportunity to compare their performance with 
an external criterion. Finally, if participant self-assessments are used 
to evaluate sessions which include goals related to attitude change, the 
sessions should include exercises or discussions which insure that partici- 
pants have become actively involved with a variety of views. 
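
As an illustration of the first guideline, the comparison of immediate and 
delayed tests can be sketched as follows. The scores are invented and the 
function name retention_summary is hypothetical; this is a minimal sketch of 
the arithmetic, not an analysis drawn from any of the studies reviewed, and 
a full evaluation would also compare these figures with a control group. 

```python
from statistics import mean


def retention_summary(pretest, posttest, delayed):
    """Summarize immediate and retained gains for one group of participants.

    Each argument is a list of skill scores from the same participants
    at pretest, end-of-course posttest, and a delayed follow-up test.
    """
    pre_mean = mean(pretest)
    immediate_gain = mean(posttest) - pre_mean
    retained_gain = mean(delayed) - pre_mean
    # Share of the end-of-course gain still present at follow-up;
    # undefined when there was no immediate gain at all.
    fraction = retained_gain / immediate_gain if immediate_gain else None
    return {
        "immediate_gain": immediate_gain,
        "retained_gain": retained_gain,
        "fraction_retained": fraction,
    }


# Invented scores for three participants.
summary = retention_summary(
    pretest=[10, 12, 14],
    posttest=[16, 18, 20],
    delayed=[11, 12, 15],
)

if __name__ == "__main__":
    print(summary)
```

A large immediate gain paired with a small retained gain, as in this invented 
example, is the pattern the guidelines warn against overlooking when only 
end-of-course measures are collected. 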






Practice with Feedback: Microteaching and Minicourses 



During the last 20 years, programs which prepare teachers for elementary 
and secondary schools have increased the time during which teaching is actually 
practiced. Expansion of practice teaching in real classrooms accounts for 
some of this increase. In addition, there has been an increase in brief 
teaching encounters focused on behaviorally specific skills and videotaped 
for subsequent review. One strategy for providing such practice with 
feedback is microteaching. Another involves self-contained instructional 
packages, called minicourses, prepared especially for inservice teachers. 

Both microteaching and minicourses show promise for improving college 
teaching, although most systematic evaluations of their use have been in 
precollege settings. 

Microteaching 

Microteaching, a scaled-down teaching encounter, was originally developed 
for use with preservice elementary and secondary school teachers. It allows 
teachers to learn and practice teaching skills within "micro" conditions, 
that is, by teaching a five- to ten-minute lesson to a small group of approxi- 
mately five pupils. The microteaching process has four steps. First, a 
preservice teacher is presented with a behaviorally defined teaching skill. 
Second, the teacher plans a lesson which incorporates the skill and teaches 
the lesson to a group of approximately five pupils while being videotaped. 
Third, the teacher receives feedback on the lesson from peers and super- 
visor and by viewing the tape. Fourth, the teacher reteaches the lesson to 
another small group of students and incorporates feedback suggestions. A 
variety of skills is usually taught in the microteaching experience, and for 
each new skill this four-step sequence is followed. 

Many elements of the microteaching format are based on research on 
observational learning and behavior modification. For example, Bandura and 
Walters (1963) have studied imitative learning and modeling and their find- 
ings have influenced the microteaching model. Cognitive discrimination 
training, with roots in the behavioral movement, serves to make the teacher 
aware of appropriate teaching behavior. In discrimination training, the 
learner is presented with relevant behavioral instances and then taught to 
discriminate between them. Learning consists of two steps: learning to 
attend to the relevant dimension and then to distinguish between different 
values of this dimension (Wagner, 1973). In the microteaching situation, 
teachers learn to discriminate between effective and ineffective instruc- 
tional behavior by viewing samples of their own and others' teaching. 

Microteaching's underlying component-skills approach requires that 
teacher behavior be broken down into specific components. Emphasis is on 
acquisition of one skill at a time. Technical skills that are often taught 
include stimulus variation, fluency in asking questions, and the use of 
higher-order questions. The selection of skills is based on the relation- 
ship between these technical skills and pupil performance (for a comprehen- 
sive review see Turney, Clift, Dunkin, & Traill, 1973, chapter 2). 

Some researchers have emphasized the self-confrontation aspect of 
microteaching (Perlberg, Peri, Weinreb, Nitzan, & Shimron, 1972; Perlberg, 
Bar-On, Levin, Bar-Yam, Lewy, & Etrog, 1974; and Fuller & Manning, 1973). 
They suggest that microteaching provides feedback to prospective teachers 
by causing the teachers to confront themselves. Through self-confrontation, 
the teacher becomes aware of any discrepancy between intentions and out- 
comes. A discrepancy leads to negative feelings such as dissatisfaction 
and discomfort. Festinger (1957), in his theory of cognitive dissonance, 
proposes that the reduction of such dissonance is a motivating force in 
individuals, leading to a change in self-perception and/or behavior. This 
suggests that in microteaching, prospective teachers improve their teaching 
skills in order to reduce dissonant feelings produced by the self-confronta- 
tion process. 

Numerous studies investigating microteaching have been conducted with 
prospective elementary and secondary teachers and programs have been set up 
on some college campuses to work with teaching assistants and faculty (for 
example, see Miltz, 1978), but we have located only three systematic studies 
that use microteaching to improve college teaching. Nevertheless, this 
technique appears to be easily adaptable to higher education and we will 
review the major and exemplary studies both at the elementary/secondary 
levels and at the college level. 

We first discuss the earlier studies by relying, for the most part, on 
secondary sources, and then review and critique findings from more recent 
research. Although these studies investigate the relationship between 
microteaching and improved teaching performance, not all of them conceptual- 
ise improved teaching performance in the same way. In some studies, the 
microteaching skills are aimed at improving overall teacher competence by 
concentrating on such areas as lesson planning, discussion skills, and con- 
trolling techniques and procedures. In other studies, skills are more 
narrowly focused and directed toward developing specific technical skills. 

It should be noted that recently, Hargie, Dickson, and Tittmar (1978) 
have described a variation of microteaching entitled "miniteaching." In 
this variation, 'reteach' has been abandoned, integration of skills is 
stressed, lesson length and number of pupils are gradually increased, and 
remedial sessions are sometimes programmed. We have found no systematic 
studies of this technique, so a critique of miniteaching is not included 
in this review. 

Early studies. After microteaching was developed in the early 1960's, 
numerous studies compared it with conventional teacher training methods. 
Allen and Clark (1967), in one of the first studies comparing microteaching 
to conventional student teaching, found microteaching to be more effective 
than student teaching in developing teaching competence. Subsequent studies 
at Stanford did not compare microteaching with conventional methods; rather, 
microteaching was assessed in terms of change in teacher effectiveness occur- 
ring from first to last microteaching session. For example, Fortune, Cooper, 
and Allen (1967) reported the results of an investigation of the effective- 
ness of the Stanford Micro-Teaching Clinic of 1965. They claimed micro- 
teaching to be effective in improving overall teaching performance, but 
their study has been assigned a low confidence rating because, among other 
problems, it lacked a control group. A survey by Ward (1970) of microteach- 
ing in United States elementary and secondary programs, noted in Turney, Clift, 
Dunkin, and Traill (1973), reported microteaching to have been generally 
effective in improving teaching competence and developing favorable attitudes 
toward education. Turney, Clift, Dunkin, and Traill (1973) also have reviewed 
the microteaching literature, drawing similar conclusions regarding the gen- 
eral effectiveness of microteaching. 

Jensen and Young's (1972) methodologically sound comparison of micro- 
teaching with conventional methods in developing teaching skills assessed 
teaching performance on three different occasions using the Teacher Per- 
formance Evaluation Scale. Factor analysis identified six performance 
factors: personality traits, warmth of teacher behavior, general classroom 
atmosphere, lesson usefulness, teacher interest in pupils, and teacher inter- 
est in student achievement. Microteaching was found to be significantly 
better than student teaching practice for the first five of these six factors, 
although the superiority of microteaching was sometimes not evident until 
the third observation after about six weeks of teaching. Jensen and Young 
interpret this finding as evidence that the effects of microteaching are 
not temporary and may increase with time. 

Not all studies find microteaching more effective than traditional 
methods. Kallenbach and Gall (1969) found no significant differences 
between the use of microteaching and student teaching. Nevertheless, they 
conclude that microteaching can be considered superior to conventional 
methods because it achieves similar results and requires less administrative 
work and time. This study earns a high confidence rating. 

The relative merits of components of the microteaching process have 
been assessed in several studies. Turney, Clift, Dunkin, and Traill (1973) 
reviewed research findings on six areas of microteaching: (a) attitudes 
toward microteaching, (b) modeling, (c) pupils versus peers in the micro- 
lesson, (d) supervision, (e) feedback, and (f) the teach-reteach interval. 
Their findings include generally positive trainee attitudes toward micro- 
teaching, although some instances of unfavorable attitudes have been noted 
particularly toward the videotape recording. Skill acquisition seems more 
effective when positive models are used, and perceptual models seem to be 
superior to symbolic models. Some skills, however, are just as effectively 
taught through symbolic models. Discrimination training appears to be an 
important element of microteaching. Several presentations of model behavior 
are superior to a single presentation. Practice in a context similar to 
that of the model enhances learning. School students rather than peers are 
recommended for the microlesson. For feedback to be effective, it should 
be directly related to the model toward which trainees are molding their 
behavior. Videotape feedback appears to ensure the best feedback, parti- 
cularly when it is varied, positive, and specific. Research on the teach- 
reteach interval was inconclusive. 

Hargie's (1977) review of early research on microteaching organized 
the evidence into four categories: changes in teaching performance, pupil 
attitudes toward their teacher, trainee teacher attitudes toward their 
course of training, and increases in pupil learning. He concluded that 
microteaching, as measured by ratings of behavior or by counts of actual 
behavior, was generally effective in improving teacher performance. Studies 
assessing pupil attitudes toward teaching were rare, but generally positive 
results with respect to microteaching were found. With respect to trainee 
attitudes toward microteaching, trainees generally consider microteaching 
to be an effective teacher training tool. Hargie noted that few studies 
had been carried out to investigate increases in pupil learning as a result 
of teachers trained in microteaching. However, one study does suggest 
that pupil learning may vary according to age and subject characteristics. 

Recent studies. The studies reviewed in this section sample recent 
research on microteaching alone or on microteaching in combination with 
other techniques. Like the earlier research, these studies for the most 
part favor the microteaching approach. However, three (Johnson, 1977; 
Perlberg, Peri, Weinreb, Nitzan, & Shimron, 1972; Perlberg, Bar-On, Levin, 
Bar-Yam, Lewy, & Etrog, 1974) of the quantitative studies did not include 
control or comparison groups and have been assigned low confidence ratings. 
The elimination of control groups in these studies was sometimes justified 
by earlier studies investigating classroom teachers and showing that 
teacher behavior is remarkably stable from lesson to lesson. Assuming that 
the teaching performance of a group not receiving the intervention would 
remain unchanged, researchers felt no obligation to include control groups. 
However, some studies have found unstable behavior for control groups (e.g., 
Borg, 1975; Perrott, Applebee, Heap, & Watson, 1975). Furthermore, there 
is little evidence from higher education to support the stability of teacher 
behavior. Of the three studies from higher education, two (Johnson, 1977; 
Perlberg, Peri, Weinreb, Nitzan, & Shimron, 1972) did not have control 
groups and were assigned low ratings. 

Among recent studies reviewed here there is evidence for changes in 
teacher knowledge, teacher behavior, and pupil behavior. Wagner (1973) 
compared two methods of influencing the knowledge and teaching skills of 
undergraduates studying distinctions between student-centered and teacher- 
centered teacher behavior. Seventy-eight undergraduates were randomly 
assigned to three groups: discrimination training, microteaching, and 
control. All participants had 15 minutes to prepare a five-minute lesson. 
The discrimination group then received about 30 minutes of training on 
discriminating student-centered from teacher-centered teacher comments: 
they rated 33 taped teacher comments and were given the correct answers 
to each as well as brief explanations. The microteaching group taught the 
prepared lesson, reviewed the videotape of that lesson, and discussed the 
tape and student ratings with a supervisor. They then retaught the lesson. 
The control group merely proceeded to the criterion test. 

On a criterion test immediately after training, trainees in all groups 
prepared and taught a 10-minute lesson to three college students. Video- 
tapes of these lessons were coded according to the six categories of student- 
centered teacher behavior used in the training. A week later all students 
completed a test in which they coded a number of teacher comments. On the 
written tests the discrimination group scored significantly higher than the 
control group, but the microteaching group did not differ from the other 
two groups. On the performance test the discrimination group was more 
student-centered as represented by such behaviors as asking for clarification, 
restating and using student's ideas, than either the microteaching (p < .01) 
or control group (p < .0005). The microteaching group was not significantly 
more pupil-centered than the control group. The greater student-centered 
behavior of the discrimination group was for the most part due to an increase 
in pupil-centered behavior rather than to a reduction in teacher-centered 
behavior. 

Wagner concludes that it is the discrimination training rather than the 
actual practice in microteaching that results in teacher change and that 
without discrimination training microteaching practice is ineffective. It 
is suggested that the combination of discrimination training and microteach- 
ing might prove very effective. Wagner's study is well designed and executed. 
Such weaknesses as the time lag between the two measurements and the fact 
that the discrimination test may have precluded assessment of whether teach- 
ers learned to attend to relevant dimensions are noted in discussion. Although 
the study is limited in its generalizability to those individuals motivated 
to change and resentful demoralization may have occurred among those in the 
control and microteaching groups, we rate it with high confidence. 

The critical role of discrimination training in the microteaching 
sequence has more recently been discussed by Hargie and Maidment (1978). 
They found a number of studies supporting discrimination training as a 
necessary component in teaching performance. 

Three studies have investigated microteaching with college teachers. 
Johnson (1977) investigated combined training in Flanders' Interaction Ana- 
lysis and training in microteaching labs for producing teacher change in 
interaction behavior, questioning, and reinforcement techniques. Fourteen 
community and junior college professors participated. Analysis of variance 
revealed significant change from pretest to posttest scores for all eight 
variables measuring teaching performance. All of the changes were 
increases with the exception of teacher talk, which significantly decreased. 
Since there was no control group and a small sample was used, many plausible 
alternative explanations exist. It is possible that the volunteer partici- 
pants were initially motivated to change their teaching behavior and would 
have done so with many kinds of training (Hawthorne effects). Or possibly 
the group improved as a result of maturation. Therefore, a low confidence 
rating has been assigned. 

Perry, Leventhal, and Abrami (1979) also investigated the effects of 
a variation of microteaching experience on college teachers. The micro- 
teaching experience, called Modified Observational Learning, consisted of 
microteaching feedback along with cognitive discrimination training. Train- 
ees were asked to role-play four teaching behaviors. For each behavior, 
participants were videotaped and provided with remedial feedback until a 
criterion level was reached. Subsequently, the master tape of the four 
videotaped role-play "takes" along with a pretraining tape was given to each 
subject. The subject was instructed to spend three and one half hours each 
week viewing both tapes as a cognitive discrimination exercise. 

For the experiment, four graduate students, the "instructors," were 
randomly assigned to either a training or a control group. Within each 
group, instructors were labeled as high or low effective according to 
pretest ratings. Two subsamples of introductory psychology students from 
the same introductory psychology course participated. Students from one 
subsample were randomly assigned to four pretraining conditions while 
students from the other subsample were randomly assigned to the four post- 
training conditions. Thus, separate pre and posttraining samples were 
used. Students completed a questionnaire for assessing teaching effective- 
ness and an achievement measure. 

Findings indicated that training interacted with lecturer differences. 
That is, for initially low effective teachers, there were no differences in 
student ratings and achievement between the experimental and control groups. 
However, for the initially high effective lecturer, higher student ratings 
and achievement scores were reported for those trained by Modified Obser- 
vational Learning. In terms of performance over time for the trainees, 
low effective lecturers showed no change in ratings or achievement from 
pre to posttraining while high effective lecturers' student ratings did not 
change but student achievement increased significantly between testing ses- 
sions. In the control condition, the low effective lecturers showed no 
change in ratings or achievement while high effective lecturers' ratings 
decreased from pre to posttraining. This study has been assigned fair 
confidence for a number of reasons. Important information relevant to the 
study's conclusions was not included in the brief report, such as the dura- 
tion of the experiment and the probability levels used to determine signi- 
ficance; nor were reliability of measures reported. The small number of 
instructors involved limits generalizability, although this weakness is noted 
by the investigators. Also, graduate students with no teaching experience 
were used as instructors, thereby limiting generalizability to inexperienced 
college teachers. 

Perlberg, Peri, Weinreb, Nitzan, and Shimron (1972) studied sixteen 
faculty members in dentistry to determine if microteaching techniques 
designed to develop classroom interaction styles and student-centered 
teaching would increase use of such behaviors. They also hypothesized 
that change produced by microteaching would be directly related to a parti- 
cipant's openness: the more dogmatic and authoritarian a participant's 
attitude toward education, the less likely the participant would change. 
All seven skills used to measure teaching performance (lesson organization, 
lecture style, providing examples, fluency in questioning, probing questions, 
higher order questions, and divergent questions) showed significant improve- 
ment (p < .01) from pretest to posttest. Data also indicated that there 
was greater improvement in questioning skills than in lecturing skills. 
Three measures designed to assess participants' attitudes, the Rokeach 
Dogmatism Scale, the Permissive-Authoritarian Scale, and a bipolar adjective 
scale, as well as attendance at microteaching sessions (perseverance) were 
used to investigate the relationship between attitudes toward openness to 
behavioral change and acceptance of innovation. Only on the bipolar adjec- 
tive scale were scale scores significantly related to post-treatment ratings. 
The best predictor of openness to change and willingness to accept innova- 
tion was perseverance in microteaching clinic sessions. The second best 
predictor was the participant's attitudes toward the microteaching concept 
and the third best predictor was the participant's attitude toward "dentist." 

This study has been assigned a low confidence rating because it lacks 
an adequate control group. Faculty improvement may have been due to factors 
other than the treatment, such as effects of history, the group's prior training
over two years in teaching improvement activities, and Hawthorne effects.

Perlberg and his associates have conducted two other microteaching 
studies with precollege teachers. Perlberg, Bar-On, Levin, Bar-Yam, Lewy, 
and Etrog (1974) investigated the effectiveness of a combination of micro- 
teaching and a computerized feedback system called Technion Diagnostic 
System on the behavior of 60 students in teacher training programs at 
Technion Institute in Israel. This combined technique brought about sig- 
nificant changes in combined scores measuring student-centered teaching 
behavior (nonverbal, not lecturing, relates to) and higher cognitive ques- 
tioning (analytical thinking). For the three student-centered teaching 
behaviors, peak performance was reached at the end of training and post- 
test scores showed a decrease from the last training session. However, 
two plausible explanations are given for this finding: (a) student fatigue, 
and (b) the fact that the posttest lesson was a general lesson not a speci- 
fic skill lesson. This study was assigned a low confidence rating primarily 
because in the absence of a control group we cannot rule out alternative 
explanations for teaching improvement such as history and maturation. 

A workshop utilizing demonstration, discussion, and microteaching to 
develop teacher strategies for increasing independent learning skills in 
pupils was investigated by Kremer and Perlberg (1979). Changes in both
teacher and student behavior were assessed. Results indicated that teachers 
in the experimental group talked less and gave less information than control 
teachers. They also asked broader questions and gave more direction. This 
finding is explained as resulting from experimental pupils being involved 
in many activities thus requiring more directions. Significant pupil beha- 
vior changes favoring the experimental group were found for three of four 
variables representing child-centered teaching (responds to teacher,
initiates talk to teacher, and initiates talk to another pupil). Increases
in number of questions and problems raised by students were also noted 
for the experimental group pupils. However, significant differences in 
higher level questions in favor of students taught by the experimental 
group were found for only two of seven variables, divergency and analysis.
Kremer and Perlberg point out that there were more changes in classroom 
interaction than in cognitive processes. 

Overall, this study indicates that microteaching can be used to increase 
independent learning skills of pupils. Although the study is well designed 
and the analysis appears appropriate, we have rated it fair because it is 
not clear that random assignment to groups was carried out. Strengths of 
the study include the choice of instruments, its thorough literature review, 
its well-developed theoretical framework, and its inclusion of qualitative 
data. 

In summary, the results of recent studies on the use of microteaching 
indicate that microteaching can be effective in improving actual teaching 
performance. More specifically, it appears that microteaching can develop
student-centered teaching behavior. Generally, student-centered teaching
behavior results in less teacher talk and more pupil talk. More questioning
goes on and less lecturing is done. Furthermore, microteaching can
be used to develop higher-order questioning on the part of teachers and
students as well as to increase teacher reinforcement skills. 

No significant relationships have been shown between personality 
correlates and microteaching performance or microteaching attitude.

Of particular interest is the finding that discrimination training 
is a critical component of microteaching. Discrimination training is a 
cognitive exercise that is concept-based rather than practice-based. The 
findings with regard to discrimination training suggest that concept-based 
training may be a powerful tool not only for increasing concept acquisition 
but also for increasing skill acquisition. When one considers the lower 
cost of discrimination training in comparison to microteaching and practice 
teaching, one begins to realize the importance of these findings. Particularly
for the college setting, discrimination training seems more
feasible than practice-based models. We return to this theme in the later 
discussion of protocol materials. 

Although positive results have been found both for microteaching alone 
and in conjunction with other techniques, a good number of the studies rate 
only low confidence. These ratings are due for the most part to the one- 
group designs which allow for a number of plausible alternative explanations 
for significant findings. 

Microteaching studies conducted with college teachers have seldom been 
well designed. Although the evidence indicates microteaching combinations 
to be beneficial in improving teacher competence, better designed research 
directed at faculty improvement needs to be conducted before conclusions 
may be drawn about which aspects of the technique are effective for improving
what skills for which college teachers. 



Minicourses 

Minicourses are based on the microteaching model and draw upon research 
on technical skills training, modeling, feedback, and film production.
Essentially, the minicourse teaches the technical skills of teaching through the
following process: (a) viewing films of behaviorally defined skills in a 
specific domain of classroom teaching, and (b) practicing those skills within 
a microteaching format. The minicourse differs from simple microteaching 
in that it was designed particularly for inservice teachers, although it 
has been used with preservice teachers as well. The minicourse model allows 
a working teacher to develop needed technical skills in a microsetting and
eventually to adapt these skills to a regular classroom. By providing reg- 
ular classroom experience, the minicourse model counteracts the criticism 
leveled against microteaching that acquisition of teaching skills in a 
restricted setting does not necessarily prepare a teacher for regular class- 
room conditions. 

Minicourse titles include "Developing Learning Skills," "Tutoring in
Mathematics," "Thought Questions in the Intermediate Grades," and "Effective
Questioning in a Classroom Discussion (Secondary Level)." Minicourse activities
are integrated into a regular school day, and may be taken by a group
of teachers in that school, by a pair of teachers who review one another's 
tapes, or even by an individual. The minicourse cycle includes (a) reading, 
viewing films, and planning a lesson, (b) teaching to a small group from a 
regular class, (c) viewing the tape, and (d) reteaching followed by feedback.
Focus is on practice and feedback since "about 10 percent of the course 
involves telling the teacher; 20 percent involves showing him; and the remain- 
ing 70 percent involves allowing him to practice his teaching skills and watch 
replays of his own performance" (Borg, Kelley, Langer, & Gall, 1970, p. 31). 

Although we found a few studies that adapted the microteaching model 
to higher education teaching improvement, no studies were located that used
minicourses for improving college teaching. Therefore, minicourse studies 
included in this review were done with elementary and secondary school 
teachers. Minicourses are included because they are highly effective at 
those levels, and because we feel that their format may be viable for use 
with college teachers. Furthermore, since there is evidence that micro- 
teaching at the college level is effective in improving instruction, it 
seems probable that minicourses are also potentially effective at that level.

Developmental studies. Numerous minicourses have been developed by the
Far West Laboratory for Educational Research and Development. All have gone 
through extensive field testing. Both preliminary and main field tests have 
been conducted for each minicourse. In these tests teachers were videotaped 
in their classrooms prior to the introduction of the minicourse. After com- 
pleting the minicourse, teachers were again videotaped in their classroom. 
Pretest-posttest analyses were made of the videotape. 

For the most part, minicourses have proven to be effective for improving 
the specific technical skills for which each was designed. Further analyses 
have investigated delayed postcourse performance, pupil change, and the use
of the minicourse with different social classes. Revisions were initiated 
when preliminary or main field tests indicated lack of teacher improvement 
on a particular skill. 

Almost all of the minicourse field tests were conducted without control
groups. This deficiency in design, in addition to other design problems,
threatens the validity of these studies. For example, such effects
as testing, maturation, and evaluation apprehension may have biased study 
results and conclusions. However, Borg, Kelley, Langer, and Gall (1970) 
anticipate these criticisms and are able to rule out a number of threats. 
For example, it has been impractical for some investigators to find appro- 
priate control groups, and this deficiency allows for a plausible alterna- 
tive explanation of effects; that is, the changes noted for teachers may 
have been due to maturation rather than to the intervention. They note, 
however, three reasons why one would expect a comparable control group's
teaching behavior to remain stable. First, the average teacher in their 
study had nine years' experience, and thus was unlikely to make any significant
teaching change without intervention. Second, they cite research
evidence indicating that classroom teaching behavior is remarkably stable
from lesson to lesson. Finally, they cite a study that used student
teachers as a control group. This control group, which could be expected
to be much less stable than an experienced group, showed significant improvement
in only two of 12 Minicourse I behavior areas over a two month span.
In those field tests which did include control groups, little significant 
change was found. 

Other limitations of the field test procedures are also discussed by 
Borg, Kelley, Langer, and Gall (1970) who note that the studies were con- 
ducted with volunteer teachers, and so generalizability is restricted. They 
go on to state that this limitation is not as serious as it first appears. 
Because inservice programs are generally voluntary, the field test data 
would apply to inservice conditions. Furthermore, they cite one minicourse 
as an example where non-volunteers and volunteers were used and changes 
were found for all of them. 

Regarding the possible effects of a videotape recorder in the class- 
room, Borg, Kelley, Langer, and Gall (1970) admit that the equipment might 
contribute to atypical teaching behavior, particularly at the pretest (evaluator
apprehension and testing effects). It is also pointed out that the
equipment might have been serving as a discriminative stimulus; that is,
only when the recorder was present were teachers emitting target behaviors. 
They rule out this possibility by stating that it is unlikely that teachers 
would maintain their posttest performance after a four-month interval has 
occurred unless those skills had been practiced during that period. Another 
limitation is the possibility that positive changes noted at posttest resulted 
merely from the teachers' awareness at posttest of the target behaviors under
study. They countered this assertion by noting that only after hours of
concentrated effort did teachers display the target behaviors, and thus it
was unlikely that teachers were emitting those behaviors simply because they
knew which skills were under study. In addition, two studies conducted
with student teachers (Borg, 1969) found no significant differences
in behavior between a group informed of target behaviors and a group
that had not been informed.

As can be seen, although a single field test for one minicourse may 
not have accounted for all possible threats to validity, the sum total of
studies that have been carried out to investigate minicourses has for the 
most part ruled out a good many threats. Numerous replications have also 
been conducted. Overall, then, it appears that minicourses do effect posi- 
tive changes in behavior of precollege teachers. 

Recent studies. Aside from these field tests, other studies have been
made of the basic minicourse model and its effectiveness over an extended
period of time. Four of these are discussed here. Each has been assigned
either a fair or high confidence rating and each supports the minicourse 
model in improving instructional effectiveness. 

In 1972, Borg studied the effectiveness of Minicourse I ("Effective 
Questioning") over an extended time interval. The study was designed as a
three-year follow-up of the effects of Minicourse I. Of the 48 original 
field-test teachers, 30 teachers were still at field test schools and 24 
agreed to participate. No control group was used. At the initial evaluation
of Minicourse I, 11 of 13 target teacher and pupil behaviors showed
large and statistically significant improvement. Four months later, teachers
showed continued improvement in three of the 11 skills that were measured 
and had not regressed significantly on any skill. Approximately three years 
later (39 months), subject performance still remained significantly greater 
on eight of the 10 scored behaviors. Thus, most changes induced by Minicourse 
I persisted over three years. Some behaviors, however, did regress. After 
three years, frequency of one-word student responses increased significantly 
and this frequency was even higher than the precourse mean. Also, teacher 
talk had regressed significantly; teacher talk had increased from 33 percent
at the course's end to 45 percent after three years, but was still below the 
initial frequency level of 53 percent. 

Borg's (1972) study has been given a fair confidence rating. It is 
subject to a number of validity threats including testing effects, selection, 
history, and maturation. Several of these threats are discussed; for example, 
he contends that maturation is not a serious threat by citing research showing 
that teacher behavior remains stable over time, but as we have seen this evidence
is mixed. Problems not ruled out by Borg are the threats of evaluator
apprehension and mortality. 

Perrott, Applebee, Heap, and Watson (1975) investigated the feasibility 
of transfer of Minicourse I to Great Britain. In a one-group pretest-posttest 
design, they checked for testing effects by randomly assigning participants 
at pretest into two subgroups; one was informed of the target behaviors
involved in the study and the other was not informed of the behaviors. There 
were no differences in performance between the groups on the pretest video- 
tape, thus ruling out the possibility that positive posttest changes could 
be attributed to testing effects rather than to the intervention itself. 
The minicourse was effective in producing significant changes at posttest 
on eight of 14 measures. The most important change was the consistent reduc- 
tion in proportion of discussion dominated by teacher talk, a change con- 
current with changes in more specific teaching behaviors. This study is 
thorough and well planned except that it lacks a control group; it serves 
not only as a test of information transfer but as a replication of Borg's
three-year follow-up. Perrott, Applebee, Heap, and Watson (1975), as noted 
above, also offer evidence of mixed results concerning stability of teach- 
ing behavior. 

Buttery and Michalak (1978) also used Minicourse I in a study which 
modified the minicourse format in two ways. First, they devised the Teaching 
Clinic Feedback Process which substituted audio tape for videotape for record- 
ing behavior and providing feedback. The second modification involved a 
naturalistic setting, using regular classroom groups and thus eliminating 
the need for potentially inconvenient special microteaching conditions. 
Further, this study used preservice teachers as its subjects rather than 
inservice teachers. The teaching clinic model was used with one group and 
compared to a control group which received regular student teaching instruc- 
tion. It is unclear whether subjects were randomly assigned to groups. The 
Teaching Clinic Process consisted of (a) lesson planning session, (b) obser- 
vation session, (c) critique preparation session, (d) critique session, and 
(e) clinic review session. Results indicated that preservice teachers who 
completed Minicourse I with these modifications displayed more significant 
changes in teacher behavior than those who received regular student teaching
instruction. Eleven of 13 target behaviors changed significantly for the
experimental group while only two of 13 were significant for the control
group. A number of design and analysis problems result in the fair confidence
rating. Because it is unclear whether randomized assignment was
carried out, effects of selection-maturation, regression, and testing may
bias the results.

Collins' study (1978) differs from the one just described in that it
investigated effects of a minicourse designed by herself and her educator
colleagues rather than by the Far West Laboratory. The target of Collins'
minicourse was teacher enthusiasm. The study focused on two issues: (a)
whether a minicourse on enthusiasm could increase the level of teacher
enthusiasm of preservice teachers, and (b) whether the effects of this
course would be maintained three weeks after the course's end. A pretest-posttest
control group design was used with delayed posttest. Participants
were preservice teachers rather than inservice teachers. Results indicated
that the experimental group increased their overall level of enthusiasm
and also tended to exhibit a greater amount of variance in performance
during posttests. In contrast, control subjects tended to display more
similar behaviors in enthusiasm during the posttests. The experimental
group maintained the increased level of enthusiasm three weeks after the
minicourse training while no important differences were evidenced for the
control group from one test to another. An observable decrease was noted
for the experimental group from posttest I to posttest II. Collins suggests
that the performance of preservice teachers was leveling after the
immediate effects of training and that if tested in another six weeks, the
experimental group's posttest III scores would not have differed from posttest
II scores. Collins supports this explanation by pointing to other
research with similar results. A high confidence rating has been assigned
to this study. The investigators attempted to control for a number of
internal and external validity threats by using observers blind to the
experimental conditions, by not informing subjects that they were involved
in a research project, by using random assignment, and by using reliable
measures. A repeated measures ANOVA was used appropriately.

In summary, the basic minicourse appears to be highly effective in 
changing teacher behavior. From recent studies it appears that the mini- 
course is a flexible tool that can be modified and adapted in a number of 
ways while remaining effective. For example, the minicourse can be used in
naturalistic settings and in settings where videotaping equipment is not
available, or it can be transferred from the United States
to Great Britain. Minicourse-induced change in instructional effectiveness
has been shown to persist over three years.

More research should be conducted on whether teaching behavior of 
inservice teachers not exposed to such an intervention does indeed remain 
stable, whether videotaping affects teachers so that nontypical teaching
behavior is recorded, whether videotaping equipment serves as a discriminative
stimulus to teachers in these experiments, and whether knowledge of
target behaviors at the pretest makes a difference in pretest behavior. In
view of the apparent effectiveness of the minicourse model with elementary
and secondary school teachers, research should be extended to college teachers
to determine if developing minicourse materials would be cost effective
at the postsecondary level.



Feedback from Ratings by Students



In studies using student ratings to improve instruction, feedback is 
regarded as an impetus for change in teaching performance. These studies 
have included (a) the use of written student rating feedback alone, (b) the 
effects of student rating feedback over time, (c) the use of written student
rating feedback with consultation, (d) the study of discrepancies between
student evaluations and faculty self-evaluations, and (e) the impact of student
rating feedback and student performance.



Ratings Feedback Alone 

Most studies on written student feedback are conducted in the following 
manner. Rating forms are completed by students approximately three to four 
weeks after the beginning of the term. These ratings are analyzed and averages
or percentages are computed for each item and/or dimension. About the
fourth or fifth week of the term, results are returned, perhaps accompanied
by normative data, to one group of instructors and withheld from others.
Student ratings are again collected as a criterion measure at the term's end.
Studies investigate whether mid-term feedback contributes to change in
rated teacher performance. In this case, no consultation between
development specialists and instructors occurs; written student feedback
results alone are used. 

Twelve studies were located using this approach. The results of the
studies are mixed. Six studies found significant positive change in teaching
performance (Butler & Tipton, 1976; Bledsoe, 1975; Sherman, 1978; Braunstein,
Klein, & Pachla, 1973; Overall & Marsh, 1976; and Tuckman & Oliver, 1968).
Three studies found no significant differences between feedback and no feedback
conditions (Centra, 1973; Miller, 1971; and Rotem, 1978). Three studies reported
mixed (Marsh, Fleiner, & Thomas, 1975; and Murphy & Appel, 1978) or uncertain
(Friedlander, 1978) results.

Although nine of the 12 studies provide at least some support for
impact from student feedback, a critical review of the quality of the
studies indicates that this conclusion may not be warranted. Several studies
finding significant positive change are flawed by design and analysis problems.
For example, in the study by Butler and Tipton no control group was
used, the sample size was small (N=17 instructors), and conclusions attributed
to the findings seem premature. The investigators claim that six of 17
instructors showed significant improvement on post-ratings, but the design
of the study does not permit us to determine the causes of these changes.
Bledsoe's study (1975) also suffers from several methodological problems
including participation of only one instructor and his class in the experiment
and the fact that the instructor under study was also the investigator
(the threat of experimenter expectancies).

In Sherman's study (1977-78), two instructors were rated after each
class meeting. Students rated the quality of instruction at that meeting
and the value of the content of that class. They were also asked to give
reasons for their ratings. Instructors were not present during data collection
and were not told the purpose of the research until later in the term.
The three conditions were no feedback (baseline), feedback in the form of
average ratings only, and feedback including average ratings, range of
ratings, and reasons for ratings. Results showed that under the third condition
the ratings of both instructors were significantly higher than during
baseline. Among the problems of this study are the absence of a condition
to control for the reactive effects of testing, dropout of participants,
and lack of parallel data for the two instructors. Nevertheless, the question
of optimal level of feedback specificity for affecting teaching is an
important one, deserving further research. 

Three studies of higher quality favoring student ratings are Braunstein, 
Klein, and Pachla (1973), Overall and Marsh (1976), and Tuckman and Oliver
(1968). Braunstein, Klein, and Pachla (1973) compared a feedback condition
with a no-feedback control condition. Although randomized assignment to
conditions was carried out, pretest results indicated that the two groups
were not equivalent at midsemester. The no-feedback group had higher midterm
ratings than the feedback group. When changes were analyzed, strong
positive shifts in evaluations were found for the feedback condition while
strong negative changes were noted for the no-feedback condition. Two
explanations for these results are offered: (a) that feedback contributed
to the end-of-semester group differences, or (b) that regression toward the
mean occurred for both groups. The nonequivalence of groups at mid-term
and possible mortality bias have contributed to a confidence rating of
fair for this study.
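
The regression explanation can be made concrete with a small simulation. The
sketch below is illustrative only and is not an analysis from the Braunstein,
Klein, and Pachla study; the rating scale, the means and standard deviations,
the number of instructors, and the function name are all hypothetical. It
shows that when two groups differ at midterm partly by chance, the lower
group tends to rise and the higher group tends to fall at the end of the
term even though no feedback is given to anyone.

```python
import random


def simulate_regression_to_mean(n_instructors=2000, seed=0):
    """Return mean rating change for the low and high midterm groups.

    Hypothetical model: each instructor has a stable "true" rating, and
    each observed rating adds independent measurement noise. No treatment
    of any kind is applied between midterm and end of term.
    """
    rng = random.Random(seed)
    low_changes, high_changes = [], []
    for _ in range(n_instructors):
        true_rating = rng.gauss(3.5, 0.4)            # stable teaching quality
        midterm = true_rating + rng.gauss(0, 0.4)    # noisy midterm rating
        end_term = true_rating + rng.gauss(0, 0.4)   # noisy end-of-term rating
        change = end_term - midterm
        # Split groups on the observed midterm rating, as a stand-in for
        # groups that happen to be nonequivalent at midterm.
        if midterm < 3.5:
            low_changes.append(change)
        else:
            high_changes.append(change)
    low_mean = sum(low_changes) / len(low_changes)
    high_mean = sum(high_changes) / len(high_changes)
    return low_mean, high_mean


low_mean_change, high_mean_change = simulate_regression_to_mean()
print("Mean change, low midterm group: ", round(low_mean_change, 3))
print("Mean change, high midterm group:", round(high_mean_change, 3))
```

Under these assumptions the low group shows a positive mean change and the
high group a negative one, which is the same pattern a true feedback effect
would produce when the feedback group starts lower. This is why group
nonequivalence at midterm weakens the causal interpretation.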

Overall and Marsh (1976) sought to clarify the mixed findings of 
earlier studies on student rating feedback. In those studies, including 
one by Marsh, Fleiner, and Thomas (1975), both positive and no-difference 
findings had been shown. The more recent investigation by Overall and Marsh 
found significant differences favoring student rating feedback. This study 
is well designed and executed using analysis of covariance, although unlike 
other studies, the unit of analysis is not instructors but the students who 
filled out the questionnaire. 

Tuckman and Oliver (1968) found significant differences in favor of 
the feedback condition with high school teachers. Although this study is 
well designed, it is questionable whether the use of change score analysis 
was appropriate. Two other studies conducted with high school teachers 
support Tuckman and Oliver's findings (Bryan, 1963; and Gage, Runkel, & 
Chatterjee, 1960). These studies were located in reviews, and so we cannot 
comment on their quality. 

The three studies (Centra, 1973; Miller, 1971; and Rotem, 1978) that 
found no significant differences between feedback and no feedback conditions 
are randomized studies with appropriate comparison groups. Miller notes 
that combining data from various sections of one instructor may have resulted 
in sailing errors due to a small n per cell. The unit of analysis in 
Miller's study was teaching assistants. Rotem (1978) notes that the short 
time interval of his study may have contributed to his no-difference findings.
The Rotem study is unique because it was conducted at a research-oriented
university.

As stated previously, Murphy and Appel (1978), like Marsh, Fleiner,
and Thomas (1975), offer mixed findings. Murphy and Appel's feedback conditions
varied slightly from other studies. The design included three
conditions: no feedback, rating feedback only, and augmented feedback. Augmented
feedback consisted of student ratings along with individual performance
standards and remedial alternatives reported by each instructor prior
to the midsemester evaluation. Significant differences for the feedback
conditions were found, although change score analysis was used. Absolute
change was small and thus implies little practical significance. In another
finding, augmented midsemester feedback was no more effective than simple
feedback in improving end-of-semester ratings.

Instructors in 85 management classes were invited to distribute midterm
evaluations to their students (Friedlander, 1978). As part of an end-of-term
evaluation, students were asked whether the instructor had distributed
the midterm questionnaire and discussed its results with the class.
About one-third of the responding students indicated the midterm questionnaire
had been distributed. The author concludes that students attribute
change in the course to the midterm evaluation to a greater extent when 
there was adequate class discussion of midterm results than when there was 
not. The report is difficult to follow, however, since it is unclear which 
students were included in subsequent analyses. Because of this and other 
design problems, the study rates low confidence. 

In summary, these studies seem to provide more evidence for than against
written student feedback alone, but many of the studies are poorly designed and
analyzed. Three previous reviews have been conducted of this research. Kulik
and McKeachie (1975) concluded that research at that time did not support 
differences between feedback and no feedback conditions in improving instruc- 
tion. A more recent review by Abrami, Leventhal, and Perry (1979) states, 
"there seems to be enough evidence to conclude that feedback from student 
ratings leads some instructors to improve their subsequent student ratings. 
However, the effect is not reliable judging from the inconsistency of the 
findings across studies. There are also no reports of the magnitude of 
significant effects so it is difficult to estimate the amount of improve- 
ment which feedback can produce" (p. 361). Rotem and Glasman (1979) in 
reviewing a large body of research on feedback regarding teaching concluded 
that there is a "minimal effect at best of feedback on instructional improve- 
ment at the university level" (p. 497). It will become clear as we proceed 
that our conclusions are somewhat more optimistic than theirs. 

Since most of the studies using student rating feedback involve volun- 
teer subjects, their generalizability is limited. Centra (1973) notes, 
however, that most faculty who use instructional improvement programs are 
volunteers. He argues that generalizability is therefore appropriate for
those most likely to use instructional improvement programs. 



Effects of Ratings Over Time 

For the most part, studies in the previous section investigated the 
effects of written feedback on teaching performance during one term. Two
studies have investigated the effects of student ratings (without consultation)
over two or more terms (Centra, 1973; Vogt & Lasher, 1973). Students were
handed rating forms about the fourth week of the term. These ratings were
tabulated and provided to the instructors as feedback. The students were
asked to fill out rating forms at the end of that term and successive terms.
In Centra's study (1973), the effects of rating feedback on teaching performance
were investigated over two semesters. Among the conditions studied
were: a feedback pre/post condition, a no-feedback pre/post condition, and 
a no-feedback posttest only condition. Interestingly, there were no signi- 
ficant differences among the groups after one semester even when sex, sub- 
ject area, and college teaching experience were taken into account. However, 
an analysis after two semesters based on much smaller samples in each group 
revealed that teachers who had received feedback twice received better rat- 
ings than those who had received feedback once or not at all. Centra's 
study is well designed, earning high confidence. Appropriate statistical 
analyses were carried out and a thorough discussion of plausible explana- 
tions for the study's findings was included. 

Vogt and Lasher (1973), at a college of business administration, also 
investigated the effects of rating feedback on instructional effectiveness 
over time. They analyzed ratings from 26,458 questionnaires for 63 teachers 
over six to eight quarters. All instructors received feedback. Their 
design is quasi-experimental and, hence, not as strong as Centra's. Regres-
sion analysis indicates that feedback did not contribute to improved teach-
ing performance over time.

Since only two studies have investigated the effects of rating feedback 
over time and since their findings are contradictory, we await further 
research to settle this issue. 



Ratings with Consultation 

Personal consultation is sometimes provided along with rating feedback 
tabulations and normative data. Usually, consultations include the inter- 
pretation of ratings and suggestions for improving teaching skills. 

Seven studies investigated the effects of this combination of ratings 
and consultation on instructional effectiveness. All of these studies 
appeared since the Kulik and McKeachie review mentioned above. For the 
most part, they support the effectiveness of a rating/consultation combina- 
tion in improving instructional performance; however, confidence ratings 
vary from low to high for these studies. 

Aleamoni (1978) used a nonequivalent control group design to assess 
the combined effects of consultation and rating feedback over a period of 
one semester to a year later. Therefore, feedback was distributed and con-
sultations were conducted at least a semester before follow-up rating forms
were collected. Ratings of the feedback recipients improved significantly
on two of five dimensions. Rather than a repeated measures analysis of
variance, a more adequate strategy might have been a multivariate analysis
of covariance. Also, Aleamoni does not state whether his analysis adjusted for
unequal N's. Aside from these problems, the nonequivalent control groups raised
threats to internal validity such as selection-history and regression. With
respect to regression, ten subjects were initially dropped from the experimental
group because they did not qualify for remediation; the experimental group
then consisted of low scorers. Consequently, this group's final higher
scores may be due to regression of their low scores toward the mean. Resent-
ful demoralization may have affected the control group, which originally was
to have consultation, thus inhibiting changes which might otherwise have
occurred.
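
The regression threat noted above can be illustrated with a minimal
simulation. It is a sketch under assumed values (a rating scale centered
near 3.5, equal true-score and occasion variance, 200 instructors), none
of which come from Aleamoni's data. No treatment of any kind is simulated,
yet the group selected for low pretest ratings still scores higher on the
second occasion.

```python
import random
import statistics


def simulate_regression_to_mean(n_instructors=200, cutoff_fraction=0.3, seed=0):
    """Simulate two rating occasions with no treatment at all.

    Each observed rating is a stable true score plus independent occasion
    noise. All numeric settings are illustrative assumptions.
    Returns (pretest mean, posttest mean) for the lowest-rated group.
    """
    rng = random.Random(seed)
    true_scores = [rng.gauss(3.5, 0.4) for _ in range(n_instructors)]
    pre = [t + rng.gauss(0, 0.4) for t in true_scores]
    post = [t + rng.gauss(0, 0.4) for t in true_scores]
    # Select on the pretest only, as a remediation program would.
    order = sorted(range(n_instructors), key=lambda i: pre[i])
    low = order[: int(n_instructors * cutoff_fraction)]
    return (
        statistics.mean(pre[i] for i in low),
        statistics.mean(post[i] for i in low),
    )


pre_mean, post_mean = simulate_regression_to_mean()
print(f"low scorers: pretest {pre_mean:.2f}, posttest {post_mean:.2f}")
```

Because the gain appears without any intervention, a comparison group
selected in the same way is needed before such a gain can be credited to
consultation.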

McKeachie and Lin (1975b) studied 37 graduate assistants and three 
faculty members teaching the introductory psychology course at the Univer-
sity of Michigan. Students completed a 32-item form about one-third through
the term and again near the end of the term. At a voluntary evening session
some students also provided data on academic measures, including an achieve-
ment test in psychology. Instructors were randomly assigned to three groups:
no feedback (13 sections), printed feedback (13 sections), and personal feed-
back (14 sections).

This report provides a well-detailed description of the personal feed- 
back condition: 

At the beginning of the feedback sessions teachers were
asked to fill out forms indicating their expectation of 
the student ratings on each dimension, their own self- 
perceptions, and where they would like to be. Typically, 
Professor McKeachie then asked them how the class was 
going and in response to their reactions, suggested how 
the student ratings confirmed (or rarely did not confirm) 
their perceptions. He then pointed out factors on which 
the teacher differed significantly from the mean of all 
classes. If there seemed to be any problems, he sug- 
gested some possible alternative methods of handling 
the problem. All of the mean ratings, however, were 
relatively favorable... so that the hope that he could 
help teachers cope with very negative feedback was not 
realized. (McKeachie and Lin, 1975b, p. 6). 

The group receiving personal feedback was rated significantly higher 
on two general items (overall value of course and general teaching effec-
tiveness) and on one of the seven dimensions (impact on students). No clear
pattern of significant effects on academic measures was found. Among other 
problems, the study suffers from subject mortality, but, particularly because 
of the random assignment of teachers, it does support the value of feedback 
with consultation over feedback alone. 

Hoyt and Howard (1978) report two studies conducted at Kansas State 
University using a combination of computerized rating feedback and consulta-
tion. One study (Study 1 in Hoyt and Howard, 1978) compared the first and
last student ratings of the same instructor and course that had been taught
on two different occasions. Results were statistically significant for 13
of 15 measures, but Hoyt and Howard point out that they were not dramatic
in the absolute sense. Since no comparison groups were used in this study,
confidence in the results is limited. Hoyt and Howard replicated this
study (Study 2 of Hoyt & Howard, 1978), using a single group, and found
significant improvement on the objective, "progress on relevant objectives."
The fact that significant improvement was not shown for individual objec-
tives on the rating scale was discounted on the basis that most faculty had
rated these as irrelevant to the course. A second analysis was conducted
to examine instructional improvement relative to contact (none, some, much)
with the office that provided consultation services. When posttest measures
were adjusted for pretest differences, it was found that rated teaching
effectiveness increased as a function of amount of contact with faculty
development services. But our confidence in the findings is low because of
the study's lack of randomization and its single-group design.

Studies by Erickson and Sheehan (1976) and Erickson and Erickson (1979)
investigated a combination of rating feedback and consultation offered by
a Teaching Improvement Clinic. In a well-designed and well-executed study,
Erickson and Sheehan (1976) compared three conditions: data collection
only, diagnostic (ratings feedback alone), and full process (ratings and
consultation). Instructor self-ratings and student ratings indicated that,
overall, the full process members changed no more or less than members in 
the other conditions, although all three groups made positive changes. 
Erickson and Erickson (1979) then designed a study with only two conditions: 
data collection only and full process. Significant differences favored the 
full process group for both student and instructor ratings. As the investi- 
gators were concerned that these findings merely reflected different group 
expectations of change, a follow-up study was conducted to investigate this 
possibility. Differences in performance between semester I and II were 
significant for 11 of 20 faculty members. Erickson and Erickson claim that 
these results show that qualitative changes do occur and are not the result 
of different group expectations. Overall, the Erickson and Erickson study 
earns high confidence, since certain initial weaknesses were tested in a 
follow-up study. 

Two studies have failed to support the ratings/consultation treatment. 
One, Erickson and Sheehan (1976), was mentioned above. The second, Weerts
(1978), found no significant differences from midterm to end-of-term for
two feedback groups (printed feedback and verbal feedback). A two-factor
ANOVA with repeated measures on one factor was used. The analysis also 
indicated that there were no significant differences among these groups and 
a no feedback control group at the term's end. Yet, Weerts points out that, 
although no significant differences were found, results show an interesting 
pattern; that is, 20 of 28 items in the verbal feedback group had higher 
ratings than corresponding items in the no feedback group. The chances of 
this occurring were less than five in 100. Similarly, on 23 of 28 items, 
the printed feedback group had higher ratings than the no feedback group. 
The chance of this occurring was one in 1000. Thus, Weerts believes that 
these results indicate that ratings and consultation might improve teaching 
performance. It is important to note that the unit of analysis was classes 
and that graduate teaching assistants taught these classes. This study is 
assigned a low rating because of several analysis problems; a multivariate 
analysis of covariance, for example, might have been more appropriate. 
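
The probabilities Weerts attaches to the item counts are consistent with
a sign test, and the arithmetic can be checked with the short sketch
below. It assumes, as our own simplification, that the 28 items behave
as independent trials with an even chance of favoring either group.
Items on one rating form are usually intercorrelated, so the true
probabilities are likely to be larger than these figures.

```python
from math import comb


def sign_test_upper_tail(successes, n):
    """Return P(X >= successes) for X ~ Binomial(n, 0.5)."""
    return sum(comb(n, k) for k in range(successes, n + 1)) / 2 ** n


# Item counts reported by Weerts (1978); independence of items is assumed.
verbal = sign_test_upper_tail(20, 28)
printed = sign_test_upper_tail(23, 28)
print(f"verbal vs. none:  one-tailed {verbal:.4f}, two-tailed {2 * verbal:.4f}")
print(f"printed vs. none: one-tailed {printed:.5f}, two-tailed {2 * printed:.5f}")
```

Under that assumption the two-tailed values fall below .05 and close to
.001 respectively, which agrees with the figures Weerts reports.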

Reviewing these findings with regard to the quality of studies, we see 
that of the studies that found significant results in favor of this technique,
three were assigned low confidence, one was given a fair rating, and one
received a high rating. Although the results are not clear-cut, they do
indicate directions to pursue in further research. For example, even though 
Weerts did not support the effectiveness of this technique in a statistical- 
ly significant way, positive trends were noted in favor of a rating/con- 
sultant approach. 

Instructor-Student Discrepancies 

If there exists a negative or positive discrepancy between the instruc- 
tor's and the students' evaluation of instruction, an imbalance is created 
for the instructor. In order to restore the state of equilibrium, the 
instructor may attempt to reduce this imbalance. Such a prediction may be 
made from social psychological theories such as incongruity theory, dissonance 
theory, and balance theory. Several studies investigating discrepancies 
were located. 

As mentioned above, Rotem (1978) found that feedback did not affect 
subsequent ratings compared with a no-feedback control and a posttest only 
control. He also found that discrepancies (a) between instructors' actual 
and desirable ratings or (b) between students' and instructors' ratings 
were no more effective than midterm ratings alone in predicting end-of-term 
ratings.

Braunstein, Klein, and Pachla (1973), mentioned above, assessed the 
effects of discrepancies between midterm perceived performance (as rated by 
instructors) and actual performance (as rated by students) on end-of-term 
evaluations. They concluded that when an instructor's expectancy was dis- 
crepant from students' ratings for a trait, a subsequent shift in the dir- 
ection of the instructor's expectancy for that trait is likely. The strength 
of the relationship between discrepant expectation and change in ratings 
was .77 (phi coefficient).
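
For readers unfamiliar with the phi coefficient, the sketch below shows
how it is computed from a two-by-two table. The function name and the
counts are hypothetical and chosen only for illustration; they are not
the data of Braunstein, Klein, and Pachla, whose table is not reproduced
in this review.

```python
from math import sqrt


def phi_coefficient(a, b, c, d):
    """Phi for a 2x2 table of counts laid out as [[a, b], [c, d]]."""
    denom = sqrt((a + b) * (c + d) * (a + c) * (b + d))
    if denom == 0:
        raise ValueError("a row or column total is zero; phi is undefined")
    return (a * d - b * c) / denom


# Hypothetical counts (NOT from the study):
# rows    = expectancy discrepant from student ratings (yes / no)
# columns = rating later shifted toward the expectancy (yes / no)
example = phi_coefficient(20, 3, 2, 15)
print(f"phi = {example:.2f}")
```

Phi ranges from -1 to 1 and is zero when the two classifications are
unrelated, so a value of .77 indicates a strong association.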

In Pambookian's 1974 study, it was postulated that moderately rated 
instructors would improve more than those rated favorably or unfavorably. 
Based on his results, Pambookian claimed that the initial level of student 
evaluation strongly influenced the instructor and that moderately rated 
instructors improved more than favorably or unfavorably rated instructors. 
In a later study (1976), Pambookian hypothesized that the greater the dis- 
crepancy between student ratings and instructor self-rating, the greater 
the improvement after feedback for those instructors. It was found that
unfavorably discrepant teachers improved on skill, feedback, rapport, 
general teaching ability, and overall value of course more than the favor- 
ably discrepant. The minimally discrepant improved significantly on one 
dimension, rapport, as compared to the favorably discrepant and showed 
strong trends in the same direction on skill. The least gain was made by 
the favorably discrepant. Pambookian's studies earn low confidence for 
several reasons. The sample sizes were small (N = 13) and no control group
was used. Statistical analysis appears to have been inappropriate. For
example, change score analysis was used with nonequivalent control groups.
Furthermore, when an analysis of variance did not reveal significant dif-
ferences on certain skills, t-tests were used (inappropriately) to investi-
gate differences between groups.

Centra's 1973 study, mentioned above, also investigated the effect of
discrepant ratings. It is well designed with a multi-institution sample. 
Centra hypothesized that student feedback would produce change in instructors 
who had rated themselves more favorably than their students had rated them 
(unfavorably discrepant group as defined by Pambookian). The analysis
generally supported this conclusion: five of 17 items showed significantly 
higher scores for the unfavorably discrepant group compared with the favor- 
ably discrepant group. Thirteen of the 17 items showed trends in that 
direction. 

Twenty-eight instructors at the University of Michigan participated in 
a study of the effects of feedback discrepancies on subsequent ratings 
(McKeachie & Lin, 1975a). A 32-item questionnaire with seven dimensions was
completed by students about one-third through the term and two weeks before
the end of the term. Instructors also completed the form once as they
expected to be rated by students and once as they "would like to teach"
(ideal). All teachers received their ratings as feedback. For analysis 
teachers were blocked into eight groups depending on the discrepancy between 
student ratings and various combinations of expected and ideal self-ratings. 
On two of the seven questionnaire dimensions (group interaction and feedback), 
significantly improved ratings were found for those whose expected and ideal 
ratings were higher than student ratings. The group which was rated more 
highly by students than by themselves changed in a negative direction (on 
feedback dimension only). This pattern of changes and other trends in the
data suggest to the authors that the discrepancies may raise (or lower)
faculty motivation and thus affect behavior.

In summary, the findings of these studies suggest that instructors
who rate themselves more favorably than their students rate them are more
likely to improve their teaching performance as a result of student rating
feedback than those who rate themselves less favorably than their students do.

As a final study dealing with discrepancies, we cite one in which 
instructor self-rating was used as a dependent variable. (In the studies 
cited above, student ratings constituted the dependent variable.) Oles 
and Lencoski (1973) investigated whether an instructor's own self-rating 
of his course and teaching would be affected in any way by receiving the 
results of students' evaluations. In this study, 24 graduate level instruc- 
tors were assessed using a pretest-posttest control group design. All 
subjects were asked to complete a self-rating form 2 weeks prior to the 
end of the course. In addition, in 12 of the subjects' classes, students 
were asked to fill out a course/instructor evaluation form. These forms
were analyzed and the results were returned to each instructor along with 
another self-evaluation form that the faculty member was requested to return 
as soon as he reviewed the student evaluation results. Instructors for the 
other 12 classes served as a control group and received no feedback but did 
complete a second self-evaluation form. The study's findings indicate that
while the test-retest correlation coefficient for the control group was .82,
the correlation for the experimental group was .54, suggesting, according to
Oles and Lencoski, that receiving student evaluations did have some influence
on the instructors' self-rating. A chi square test on the total number of
changes regardless of direction of change was significant. Changes in self-
ratings were not all in the direction suggested by student evaluations.
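
One way to gauge the gap between the two test-retest coefficients is
Fisher's r-to-z transformation for independent correlations. The sketch
below is our illustration and not an analysis reported by Oles and
Lencoski. It assumes that all 12 instructors in each group supplied
both self-ratings; the report summarized here does not state the number
actually analyzed.

```python
from math import atanh, erfc, sqrt


def compare_independent_correlations(r1, n1, r2, n2):
    """Fisher r-to-z test for two correlations from independent samples.

    Returns the z statistic and its two-tailed normal probability.
    """
    z1, z2 = atanh(r1), atanh(r2)
    se = sqrt(1.0 / (n1 - 3) + 1.0 / (n2 - 3))
    z = (z1 - z2) / se
    p_two_tailed = erfc(abs(z) / sqrt(2.0))
    return z, p_two_tailed


# .82 (control) and .54 (feedback) are from the study; n = 12 per group is assumed.
z, p = compare_independent_correlations(0.82, 12, 0.54, 12)
print(f"z = {z:.2f}, two-tailed p = {p:.2f}")
```

With groups this small the standard error of the difference is large,
so the contrast between the two coefficients is best read together with
the chi square result rather than by itself.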

Effects on Student Performance

The relationship between the use of student rating feedback and stu-
dent performance has also been investigated. The assumption underlying
studies that used student achievement as an outcome measure is the following: 
if student rating feedback does improve instruction, that improvement should 
be evident in student performance. As we have seen, McKeachie and Lin (1975b) 
did not find clear effects of feedback on student achievement. Three other 
studies have investigated this notion. Both Miller (1971) and Marsh, Fleiner, 
and Thomas (1976) found no overall significant differences between feedback 
and no feedback groups on student achievement exam scores. Miller's study
has been assigned a high confidence level, and we regard this aspect
of Marsh, Fleiner, and Thomas's study with a fair level of confidence.

Overall and Marsh (1977) conducted a similar study a year later and 
found that students of faculty who received ratings feedback with consulta-
tion scored significantly higher and noted greater interest in taking future
coursework in the subject area than students of instructors in a no-feedback
condition. Their analysis may be regarded with a fair level of confidence. 
Based on their findings and the previous contradictory findings, Overall and 
Marsh recommend additional research on this issue. It is our recommendation 
as well. 

To conclude this chapter on student ratings, we are pleased to note the 
relatively large number of studies although we are disappointed with their 
variable quality. The clearest finding concerns discrepancies between the 
instructor's self-rating and ratings by students. This discrepancy appears
to be an effective predictor of who will benefit from ratings feedback. Feed- 
back has its greatest impact on those whose self-ratings are more positive 
than the ratings made by their students. 

The most pressing topic for further research, in our opinion, is the 
relative effectiveness of written feedback alone compared with written feed- 
back plus consultation. Either written feedback alone or written feedback 
plus consultation has been shown by most studies to be superior to no feed- 
back. Only three studies (Erickson & Sheehan, 1976; Weerts, 1978; McKeachie 
& Lin, 1975b) directly compared written feedback alone with feedback plus 
consultation, and only one of them (McKeachie & Lin, 1975b) found clear sup- 
port for consultation as more effective. Since consultation is an expensive 
activity, it is important to learn for which faculty it is most useful. Greater 
attention should be given in this research to instructor variables such as 
motivation and self-other rating discrepancies.

Concept-Based Training: Protocol Materials



Protocol materials are film or videotape recordings which illustrate 
educationally relevant concepts. Developed for precollege teachers, they
also show promise for postsecondary education. Protocols are designed to 
link educational theory to the teaching process. Generally, a single pro-
tocol module focuses on a set of related concepts. For that reason proto-
cols are considered to reflect a concept-based model of teacher education
in contrast to microteaching and minicourses, which reflect a practice-based
model. 

Protocol training is carried out in the following manner. Teachers 
are provided with written materials and films which describe and illustrate 
the concepts. They learn how to apply the concepts through a sequence of 
visual illustrations, written exercises, and tests. To illustrate protocols, 
we describe materials produced at Indiana University entitled, "Concepts and 
Patterns in Teacher-Pupil Interaction." There are ten films in the series.
Concepts basic to the series are introduced in three films, "Questioning: 
Reproductive and Productive," "Probing and Informing," and "Approving and 
Disapproving" (six concepts in all). Each film is seven or eight minutes 
long and provides classroom examples of the concept. Six films show class- 
room episodes to be analyzed according to the target concepts, thus providing 
practice in interpreting classroom behavior. Each of these films is approx- 
imately eleven minutes long. The tenth film, 35 minutes in length, is used 
as a performance test. It includes 30 brief scenes to be categorized accord- 
ing to the target concepts. Protocol materials are aimed at producing con- 
cept acquisition in users, facilitating skill acquisition, and (by inference) 
promoting desirable changes in the students of teachers who have been trained. 

The protocol idea, materials portraying behavioral events relevant to instruc-
tional concepts, was first proposed by Smith (1969). In 1970, the Bureau of
Educational Personnel Development of the Office of Education funded a number 
of projects at universities throughout the country. Partly because of the 
funding arrangements, more work has gone into development of the materials 
than into evaluation. In his survey of protocol module evaluations, Cooper 
(1975) notes that compared to the number of protocols produced, relatively
few have been adequately evaluated. Cooper summarizes evidence from 73 
sources on the effectiveness of protocols in improving teaching. He reviews 
these studies with respect to four issues: teacher skill acquisition, teacher 
concept acquisition, reactions to protocols, and pupil outcomes. Of this 
research, only one study was identified showing that protocol modules could
change on-the-job teacher behavior (Borg and Stone, 1974), and this study
is discussed below with Borg's other protocol studies. Cooper also identi- 
fied a number of studies conducted at Utah State University, Michigan State 
University, Far West Laboratory for Educational Research and Development, 
and Indiana University. For the most part these studies found positive 
results for concept acquisition by preservice and inservice teachers. Fur- 
thermore, Cooper indicated that teachers generally had positive reactions 
to protocol materials. However, Cooper notes an absence of research showing 
impact on pupil behavior. 

Since Cooper's 1975 survey, we have identified additional studies of
protocols' effects on teacher and pupil behavior carried out primarily by
two groups: Borg and associates at Utah State University, and Gliessman,
Pugh, and associates at Indiana University. None of these studies investi-
gated protocol materials at the college level, but they are included here
because of the potential adaptability of the technique to postsecondary edu- 
cation. 



Concept Acquisition 

Several of the Indiana studies investigated users' reactions to pro-
tocols and alternative ways of structuring protocols.

Gliessman and Pugh (1976) studied concept acquisition and teacher and 
student reaction to the protocol on teacher-pupil interaction. Generally, 
use of the protocols resulted in significant gains in acquisition of concepts 
basic to the series, and teachers and students reacted favorably to the series. 
The experimental design also allowed for checking effects of pretesting on
posttest results, and pretesting did affect posttest scores. This study 
rates fair confidence, primarily because the comparison group also received 
the protocols intervention. 

Gliessman and Pugh (1978b) explored the instructional rationale of 
protocol material. More specifically, they were interested in determining 
what components of a protocol sequence were necessary for and effective in
producing concept acquisition. Teacher-pupil interaction protocols were
used in two studies to compare a number of instructional treatments. For
example, one group received names of concepts only, while another group
received concept names and concept definitions. A third group received
concept names, definitions, and filmed exemplifications. A fourth group
received a combination of concept names and filmed exemplifications.
Gliessman and Pugh concluded that receiving concept definitions alone did
not yield effects equivalent to those achieved through the exemplifications
of defined concepts; exemplification contributed significantly to concept
acquisition. We view this study with fair confidence. Such problems as
selection-history biases in study one and the use of a probability level
of .081 preclude a high confidence rating. 

Another study by Gliessman and Pugh (1978a) also investigated concept 
acquisition of teachers trained with the teacher-pupil interaction protocol. 
Its distinctive purpose was to investigate the effect of protocol films of 
contrasting structure on the acquisition of teacher behavior concepts and 
reactions to the filmed treatment. Three separate studies were carried out 
with preservice and inservice teachers enrolled in a graduate level educa- 
tional psychology course. Significant gains in concept acquisition were 
found for groups viewing high or low structure films but no significant 
differences were found between these two film treatments. When high struc- 
ture, low structure, and a high/low structure combination were compared, 
significant increases in concept acquisition were found for all three 
groups. A comparison of means revealed significant differences between 
the high- and low-structure groups favoring the low-structure group. A 
third substudy investigated the contradictory results of the first two 
substudies: the finding of both significant and nonsignificant differences
between groups trained by high-structure films or low-structure films. When
teacher discussion was controlled for, no significant differences were found
between groups trained by low or high structure. This study rates high
confidence; it is well designed and appropriate statistical analyses were
used.

These studies verify the effectiveness of protocol materials for con- 
cept acquisition. The amount of structure in the films may vary without 
reducing learning, but learning is enhanced when concepts being taught are 
exemplified as well as defined. 



Skill Acquisition 



Other studies have assessed the impact of protocols on the classroom 
skills of teachers. Gliessman, Pugh, and Bielat (1979a) investigated con-
cept acquisition for the protocol on teacher-pupil interaction. One group
received protocol training. A second group served as the control group and
received student counseling training. Mean concept acquisition scores and
mean skill acquisition scores were significantly greater for the group
trained with the protocol module. The correlation between concept acquisition
and skill frequency was reported at p = .08 (df = 8), and the investigators
conclude that mean skill frequency scores tend to increase with increasing
levels of concept acquisition. This correlation, however, is rather low and
may in part be due to low statistical power. A larger sample size might
produce a higher correlation. A number of design and interpretation
weaknesses have led to our low confidence rating. First, it is not clear
whether randomization was carried out. Thus it is possible that the two
groups operated under different historical circumstances and, hence, that
significant differences are the result of selection-history biases. Second,
differences between the groups for concept acquisition are statistically
significant but their practical significance is uncertain.

focal criterion for both concept and skill acquisition. Three different



tion. 



No significant relationship was found between skill concept acquisi-
tion scores and skill frequencies. However, trainees' written responses
provided subjective evidence of both conceptual and nominal outcomes of
protocol training. No observational effects were found in trainees' written
responses. Trainees' written responses also indicated that ability to
apply the concept of probing was positively and significantly related to
the frequency of probing. However, written responses indicated that train-
ees' accuracy in dealing with probing concept characteristics was unrelated
to skill acquisition. The data did appear to confirm previous findings
that protocol training leads to concept acquisition.

Gliessman, Pugh, and Bielat (1979b) has been assigned a fair confidence 
rating. Although we applaud its intended purpose of replication, the short 
duration of training leads us to question the findings. This limitation 
is also noted by the investigators. The authors further point out that 
the mean skill frequency of probing was considerably smaller than in their 
previous investigation, possibly a result of the short training inter- 
val. 

Both concept acquisition and skill acquisition were investigated by 
Kleucker (1974). She studied preservice teachers randomly assigned to 
four conditions: protocol training alone, skill training alone (micro-
teaching), both protocol and skill training, and a placebo, that is, train-
ing unrelated to the study. The two target behaviors were asking probing
questions and offering accepting reactions. Protocol training and skill
training led to concept acquisition and skill acquisition respectively when
compared to control groups. But protocol trainees did not perform better
on concept tasks than those trained with microteaching, and those trained
by microteaching did not perform better on skill tasks than those trained
with protocols. Training in both was at least equally effective and some-
times significantly more effective than training in either alone. This
study rates high confidence. Discussion of findings, limitations, and
implications is thorough. As limitations, it is noted that the small num-
ber of participants may have contributed to no significant differences in
some of the comparisons, that the control group may have served as a treat-
ment, and that instruction time was not held constant across conditions.
Furthermore, Kleucker notes certain limitations in the criterion tests used. 

Borg and Stone (1974) made a pretest-posttest comparison of behavior
changes brought about by the protocol modules on extension and encouragement. 
These Utah State University protocols are part of a series of six related 
to teacher language behaviors. It is important to note that all teachers 
were informed of the target behaviors prior to the pretest in order to 
eliminate one threat to validity. The threat was that positive gains would 
result not from the treatment but from subjects' posttest knowledge of the
target behaviors. Results showed that teachers made significant gains on 
five of seven specific behaviors covered in the protocol materials.

The second part of the study compared protocol modules and minicourses 
in effecting teacher behavior change. A nonequivalent control group design
used field test data previously collected with Minicourse I, which trains
behaviors similar to the extension protocol study, and with Minicourse II,
which trains behaviors similar to the encouragement protocol. Although the
sample used in the Minicourse I study was similar to the protocol study, the
sample from Minicourse II was not. Both groups showed similar gains for
most of the behaviors that were compared. Borg and Stone conclude that
from a cost-benefit perspective, the protocol model might be more desirable
than the minicourse model for increasing the use of simple, clearly defined
teaching behaviors. This study rates low confidence because of the non-
equivalent control group design and the use of change scores in analysis.
The differences in sampling for the minicourse and protocol groups and the
possibility of differential history effects also confound the interpreta-
tion of results.



Changes in Students 

Pupil behavior as well as teacher behavior was assessed in two protocol
studies. Borg, Langer, and Wilson (1975) compared teachers trained with
the classroom management skills protocols with a no-treatment control group of
inservice elementary school teachers. Changes in teacher and pupil behavior
were assessed. Teachers using protocols were rated more favorably on all
13 target teaching behaviors, but differences were generally small and
nonsignificant. For pupils taught by teachers in the experimental group, work
involvement increased significantly and deviant behavior decreased
significantly in recitation situations. In seatwork situations, although pupil
work involvement significantly increased, deviant behavior showed no
significant changes. Two reasons were given for the low teacher behavior
frequencies: (a) the possibility that the observation time period was too
short, and (b) the possibility that the observers became fatigued over the
two-hour observation period. The results have a low confidence rating
because of design deficiencies. First, low statistical power (N = 29) may
account for the nonsignificant changes in teaching behavior and in pupil deviant
behavior during seatwork. Second, data were analyzed using analysis of
covariance on nonequivalent control groups. It is possible that a combination
of measurement error in the pretest and differential growth patterns
between the experimental and control groups may have led to both overadjustment
and underadjustment of the data, washing out significant differences
between the groups.
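
The statistical power concern can be made concrete with a rough calculation. The sketch below is an illustration and not part of the study: it uses a normal approximation to a two-sided comparison of two group means, and both the standardized effect size (0.5) and the even split of about 15 teachers per group are assumptions, since the report gives only N = 29.

```python
from statistics import NormalDist


def approx_power(effect_size, n_per_group, alpha=0.05):
    """Approximate power of a two-sided, two-group comparison of means.

    Normal (z-test) approximation with equal group sizes and unit variance.
    effect_size is the standardized mean difference (an assumed value here).
    """
    z = NormalDist()
    z_crit = z.inv_cdf(1 - alpha / 2)
    # Expected value of the test statistic under the alternative hypothesis.
    noncentrality = effect_size * (n_per_group / 2) ** 0.5
    # Probability of landing in either rejection region.
    return z.cdf(noncentrality - z_crit) + z.cdf(-noncentrality - z_crit)


if __name__ == "__main__":
    # Hypothetical group sizes; 15 per group roughly matches N = 29.
    for n in (15, 30, 64):
        print(n, round(approx_power(0.5, n), 2))
```

Under these assumptions, power with about 15 teachers per group falls well below the conventional .80 level, so a moderate true effect could easily go undetected.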

A study by Borg in 1977 investigated the impact of two protocols,
teacher-pupil interaction and pupil self-concept, on changes in teacher and
pupil behavior. Subjects were randomly assigned to one of the two protocols,
each condition serving as a control for the other. With respect to changes
in teaching performance, about one-half (seven of thirteen) of the classroom
management behaviors increased and 11 of 12 self-concept teacher behaviors
increased (except for four negative behaviors that had not been present prior
to the treatment). For pupils of teachers using management protocols, no
significant change was noted for work involvement, but significant decreases
in deviant behavior were found. Students of teachers using self-concept
modules significantly reduced off-task behavior but did not reduce other
target behaviors. There was no significant improvement in pupil self-concept
for either experimental or control groups.

The mixed results of this study are viewed by Borg as partially
successful. He suggests that teacher behaviors improved more with the self-concept
module because less time was necessary for training these teacher behaviors
than for training classroom management skills. Furthermore, Borg suggests
two reasons that the improvement in pupil self-concept was small: (a) the
possibility that in fact there is no relationship between the behaviors
taught in the self-concept protocol and an improved student self-concept,
and (b) the possibility that overall differences were not revealed because
most of the Anglo students in the classrooms had initially good self-concepts
and hence served to wash out the gains of the small group of minority
students. Borg's study rates high confidence. Although small sample size
possibly contributed to low statistical power, the use of ANCOVA and the
thorough discussion of plausible explanations merit some confidence. A
no-treatment control group would have permitted assessment of cross-protocol
effects, i.e., whether the self-concept protocol contributes to improvement
in classroom management and vice versa.

Borg conducted a 1975 study using the four Utah State University
protocols on teacher language. He compared teaching performance between a
group trained in four protocols and a no-treatment control group of
inservice teachers. He also investigated the relationship between teacher
behaviors covered in these four protocols and pupil achievement as well as
the relationship of teacher characteristics and pupil achievement.
Significant gains were made by the experimental group on all twelve measured
teaching behaviors while the control group made significant gains on four
of the twelve behaviors. When both groups' posttest measures were adjusted
for pretest differences, it was found that the experimental group had
significantly higher scores on four of the teaching behaviors. Borg notes
that significant change for the control group for some of the teaching
behaviors is in conflict with the premise that teachers' behavior remains
stable over time without intervention. He suggests three possible
explanations for his results: (a) changes in observer standards between pretest
and posttest, (b) the content area taught for the posttest being more
appropriate for language development, and (c) contamination (compensatory
rivalry, diffusion or imitation of the treatment). Borg concludes that
contamination was the most likely cause of the control group's gains on the
four teaching behaviors. In addition, partial correlations were computed
between pupil achievement on two achievement measures and the 12 teaching
behaviors. When pupil academic ability, parents' occupation, and teacher
coverage of the unit's content were partialed out, it was found that the
teacher's use of defining, voice modulation, paraphrasing, and cueing were
significantly related to student achievement on two measures and the teacher's
use of opening review and terminal structure were significantly related to one
achievement measure. However, none of the partial correlations between
ten high-inference teacher characteristics and student achievement were
significant.
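
The idea of "partialing out" a background variable can be illustrated with the standard first-order partial correlation formula. The sketch below is only an illustration under stated assumptions: the study removed three covariates at once, whereas this example removes a single covariate, and the correlation values are hypothetical, not Borg's data.

```python
from math import sqrt


def partial_corr(r_xy, r_xz, r_yz):
    """First-order partial correlation of x and y with z held constant.

    r_xy, r_xz, r_yz are ordinary (zero-order) correlations, for example
    x = a teaching behavior, y = pupil achievement, z = pupil ability.
    """
    denom = sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))
    if denom == 0:
        raise ValueError("z is perfectly correlated with x or y")
    return (r_xy - r_xz * r_yz) / denom


if __name__ == "__main__":
    # Hypothetical values: behavior and achievement both relate to ability.
    print(partial_corr(r_xy=0.5, r_xz=0.5, r_yz=0.5))
```

When the covariate is related to both variables, the partial correlation is smaller than the raw correlation, which is why a teaching behavior can appear related to achievement before adjustment and less so afterward.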

Several problems with the study reduce our confidence in its results. 
As in the Borg, Langer, and Wilson study (1975), an analysis of covariance 
was used to analyze data collected from nonequivalent control groups, and 
so significant results for four of the teaching behaviors may be the result 
of underadjustment caused by pretest measurement error rather than by the 
protocols themselves. Or the nonsignificant differences may be a wash-out
effect. Also, the possibility that compensatory rivalry, or diffusion or
imitation of treatments took place on the part of the control group
complicates the interpretation of the findings even further.
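
The underadjustment problem can be sketched numerically. The example below is a simplified illustration, not the analysis used in the study: it assumes a linear relation between true pretest standing and posttest score, independent random error in the observed pretest, and equal reliability in both groups. All parameter values are hypothetical.

```python
def expected_adjusted_difference(true_effect, slope, pretest_gap, reliability):
    """Expected ANCOVA-adjusted posttest difference (experimental minus control).

    Assumed model: posttest = slope * true_pretest + treatment effect.
    Measurement error shrinks the within-group regression slope on the
    observed pretest to reliability * slope, so only part of the pretest
    gap between nonequivalent groups is removed by the adjustment.
    """
    if not 0.0 < reliability <= 1.0:
        raise ValueError("reliability must be in (0, 1]")
    raw_difference = true_effect + slope * pretest_gap
    attenuated_slope = reliability * slope
    return raw_difference - attenuated_slope * pretest_gap


if __name__ == "__main__":
    # Hypothetical case: no true effect, control group starts 2 points ahead.
    print(expected_adjusted_difference(0.0, 1.0, -2.0, 0.7))
    # With a perfectly reliable pretest the adjustment recovers the true effect.
    print(expected_adjusted_difference(0.0, 1.0, -2.0, 1.0))
```

With an unreliable pretest, the adjusted difference retains part of the initial gap between groups. Depending on which group started ahead, this residual can mimic a treatment effect or mask one, which is the ambiguity noted above.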

Our review of research with protocols leads to several conclusions.
Teachers and pupils appear to react favorably to the use of protocols.
Generally, teachers show significant concept acquisition from protocol
training. For skill acquisition, results of protocols are not as clear,
although in some studies skill gains have been documented for at least
some target behaviors. Findings are also mixed when protocols' impact on
pupil behavior and achievement is investigated. Each study finds some
positive effects for protocols. Further research should reveal for which teacher
and student behaviors effects are most reliable.

To our knowledge, no research has been conducted on the impact of 
protocols on college teachers and students. Since the training of teachers 
using protocol modules appears to lead to increased concept acquisition, 
colleges interested in this goal might explore the protocol format. Since 
protocol training requires neither videotaping nor classroom practice, it 
is less threatening and less disruptive of regular teaching than are
practice-based programs. Of course, protocol development is expensive,
beginning with the identification of concepts critical to instruction. Some of
that fundamental work should be repeated for higher education, since it 
is by no means clear that existing protocols and the concepts they exemplify 
are the critical ones for the college classroom. 
Conclusions and Implications



We have reviewed scores of empirical studies of attempts to improve col- 
lege teaching. These studies evaluate interventions aimed at assisting fac- 
ulty to change their teaching activities or roles in order to enhance the 
educational experience for themselves and their students. Impact of the
interventions has been assessed through measures of the professors' attitudes,
through observations of their classroom behavior, through reports of their
students about the class, and through measures of their students' learning.

Our review was undertaken to determine what guidance this literature 
can provide to those who conduct research and to those who design and imple- 
ment instructional improvement programs in postsecondary education. In this 
final chapter we discuss several issues regarding research and practice. 
These issues constitute an agenda for our own subsequent research and writing 
and are discussed here only briefly. 

The literature on teaching improvement in higher education is larger 
than we had expected when we began this review. It is also of lower quality 
than we had hoped. Table 1 summarizes studies charted in Appendix A accord- 
ing to the intervention addressed and our confidence rating. Recall that 
our confidence rating serves only as an approximation. Our criteria are not 
rigidly fixed and reliability of classification may not be perfect.
Nevertheless, there are sufficient entries in most cells of that table to convey
an adequate impression of the pattern of relative attention given to topics 
and of the quality of research from topic to topic. We also note in Table 
1 (in parentheses) the number of entries which support the intervention in 
question. This display suggests several observations. 

1. Most studies support the intervention in question. Overall, 82 
percent of the entries in Table 1 support the intervention being investigated. 
(Please note that studies with multiple variables are entered in more than
one category of the table.) 

2. Each specific intervention category receives support from at least 
50 percent of the entries. For 11 of the 13 categories, support is provided 
by 70 percent or more of the entries. 

3. The higher the methodological quality of the entry, the less likely
it is to support the intervention being investigated. Interventions are
supported by 93 percent of entries rated low, by 86 percent of entries rated
fair, and by 60 percent of entries rated high. This does not mean that only
high-quality studies should be taken seriously. It may be that in fine-tuning
methodology, investigations have become insensitive to the phenomenon being
studied. It is also possible that, since lower quality studies are flawed in
different ways, combining their results exploits overlapping strengths, while
not doing so would overemphasize their separate weaknesses.

4. We have been particularly impressed with the research on
interventions developed for precollege teachers. The precollege research on
microteaching, minicourses, and protocols has in large part involved research
programs rather than single studies and has shown awareness of desirable
design characteristics even when circumstances did not permit incorporation
of all the desired features. It is worth speculating on reasons for the
apparent lower quality and greater fragmentation of research in
postsecondary settings. Are higher education researchers less competent, standards
less stringent, problems more difficult, funding less available, or is some
combination of these at work?

Note to Table 1: Numbers in parentheses represent studies which support the
intervention. The total number of entries in this table (77) is greater than
the number of studies in Appendix A (60) because several studies apply to more
than one intervention category.

A well-defined field of inquiry should draw upon coherent theory, sub- 
scribe to high standards of research, and build upon previous research in 
a systematic way. By these criteria, research on the improvement of college 
teaching does not yet constitute a well-defined field. For most studies, 
the basis in theory is strained and for some it is non-existent. Work on 
major conceptual issues remains to be done; before we can validate materials 
or programs for instructional impact, we must clarify the nature of
"instruction" and the meaning of "improvement." These concepts are seldom explicitly
defined in this literature and, as we struggled with implicit definitions,
they often struck us as inappropriately narrow. Further, a host of design
problems plagues this research. Finally, the field is fragmented because most
research is only a single-study effort.



Implications for Research

We shall limit our discussion here to only five implications for research. 
They are general in nature but progress on them is basic to the further dev- 
elopment of the field. 

1. Individual difference variables deserve greater attention. Most of 
this research treats participating faculty as an undifferentiated mass,
distinguished only by the treatment to which they are assigned. More attention
should be given to individual differences (either as independent variables 
or as blocking variables). The value of attention to individual differences 
is demonstrated by the studies of discrepancies between faculty self-ratings 
and student ratings. Systematic study of demographic information, motivation, 
and other self-described characteristics may assist in identifying those 
persons who are most ready to engage in change projects and for whom parti- 
cular interventions are most suitable. Likewise, when the impact on students 
of a teaching-improvement intervention is studied, individual differences 
among students should be noted; otherwise significant interactions will not 
be documented. 

2. Dependent variables require comparable definition and operationalization
across studies. We hoped to aggregate the findings from studies of
several of the interventions under review. For example, research on the
impact of student feedback might be combined across studies according to the 
dimensions of the questionnaires used in each study. One hypothesis is that 
ratings feedback would have greater (and faster) impact on a "rapport" fac- 
tor than on a "course organization" factor. Since so few studies use the 
same questionnaire or analyze questionnaires in a similar way, our attempt 
at such aggregation proved futile. Seldom are common schedules for class- 
room observation used and in few fields are there standard measures of 
student achievement. Although studies should not require uniformity in 
design, they cannot build upon one another until some comparability emerges. 
3. Much wisdom remains undocumented and unshared. A number of figures
in the faculty development movement have accumulated impressive experience 
in a variety of settings and projects during the last few years, but little 
of that experience is systematized and available to others. For instance, 
many people have learned a great deal from the PIRIT project, and it has 
informed the design of a subsequent national project; yet little generalizable 
knowledge emerged from the research on PIRIT. The field needs better communi- 
cation channels to capture and share such wisdom. Although it may not itself 
be research-based, that wisdom is empirical in that it derives from experience, 
and it should play a critical role in the planning of subsequent research. 

4. Cross-campus collaboration is absent. Appendix A studies are
isolated efforts of investigators on individual campuses. Inter-campus 
research networks are potentially powerful tools for dealing with several 
of the problems we have noted. Wisdom from previous efforts would be part 
of the planning of such studies. Experts in research methodology could be 
part of the research team. Practical problems of research design such as 
random assignment and small numbers of participants would be alleviated. 
The time required for planning, data analysis, and writing could be shared. 
Similar collaboration is not unknown in other fields. For instance, coopera- 
tive clinical trials have long been used in medical research, but that method 
would be new to higher education. 

5. Most data reflect only superficial levels of experience. The studies 
rely primarily on self-report and questionnaire data. Seldom does the research 
go to levels of experience below the surface and reveal cognitive, emotional, 
political, and developmental experiences. What goes on in the mind of the
professor while teaching or while watching a tape of his or her class? What 
feelings are experienced while reviewing a computer report of student ratings? 
How do perceived rewards for teaching relative to rewards for research produc- 
tivity influence professors' responses to opportunities for improving their 
teaching? How do developmental tasks at particular stages of adult life inter- 
act with perceived teaching problems and challenges? 

The dominant research strategies in this body of literature come out of 
the quantitative methodological tradition and are insufficient for investi- 
gating questions such as those just listed. To advance the field we need 
careful classroom ethnographies, disciplined case studies, sensitive clinical 
interviews, as well as rigorous experimentation. The literature of higher 
education does contain exemplary efforts using several such methods. Andrews'
(1978) case study is illuminating. Cottle's (1977) essays are provocative.
Axelrod's (1973) portraits of teachers provide unusual depth. Becker, Geer,
and Hughes' (1968) participant-observations richly develop the context of
student life. And Mann, Arnold, Binder, Cytrynbaum, Newman, Ringwald,
Ringwald, and Rosenwein (1970) document the classroom using multiple sources of
data. Admirable as these efforts are, none is directed toward interventions 
for improving teaching practice. The necessary tools have been developed 
and their use has been mastered, but the quantitative and qualitative approaches 
are not yet intertwined and applied to the study of improving college teaching. 
Implications for Practice

What does this research offer those who design teaching improvement 
programs? What activities available to them should be supported for maximum 
impact and cost effectiveness? 

Given the mixed quality of research design, no conclusions can be drawn
without reservation, yet several generalizations do seem justified. 

1. Workshops and seminars are useful instruments for motivating and
consciousness raising under certain conditions. Nevertheless, most
workshops and seminars, even those with specific training goals, are unlikely
to bring about lasting changes in classroom behavior or student impact unless
there is provision for faculty to continue practicing the skills in question
and to receive critical feedback on their efforts.

2. Concept-based practice appears to be a promising tool, if
educationally critical concepts are selected. Discrimination training, which is
central to concept-based practice, is less costly, disruptive, and intimidating
than is training-with-practice, which is required in experience-based training.

3. End-of-course feedback from students has become institutionalized 
on many campuses. Little is known about how faculty "process" their feed- 
back, but active processing can be facilitated if the ratings are accompanied 
by other help, particularly by personal consultation. Those faculty most 
likely to change are persons whose ratings by students are less positive than 
their ratings of themselves, and they are probably the faculty in whom the 
time of consultants should be invested. 

4. Grants to support faculty-designed projects require considerable 
staff time if their impact is to be optimized. Staff involvement in refining 
proposals and carrying them out is likely to enhance the quality of the work. 
Staff assistance in evaluating the project provides a data base for making 
further awards. Otherwise, evaluation is unlikely to be done by the grant 
recipient alone. 

As a general note in conclusion, we observe that the study of these
interventions, at least as it is conveyed in research reports, typically
fails to engage faculty as collaborators in inquiry. Instead, we make our
colleagues the "objects" of our training programs and the "subjects" for
our research studies. That situation is lamentable since the questions
about teaching and learning which engage this field are as intellectually
challenging as any a scholar might find in his or her own field of
specialization. For the classroom teacher, such questions also have the attraction
of day-to-day relevance. It is our hope that in the next generation research
will include fewer studies where faculty are assigned to treatments and more
studies which are collaborative attempts to grapple with the phenomenology
of teaching and learning. From such inquiry will come fuller understanding
of the operations by which effective instruction is carried out and of the
impacts it has on learning.
Appendix A: Summary of Studies Critically Reviewed

This Appendix contains schematic outlines of the studies analyzed in the text. Criteria for including
studies are described in Chapter I. Detailed discussion of several categories of these charts is also given
in that chapter. Symbols frequently appearing in the charts are defined below:

E - experimental group
C - control group
X - intervention or treatment
(X) - alternate intervention
R - randomization
[illegible] - groups not randomly formed
[illegible] - pre and post data from different persons
O - observation
? - the information in question was not reported or was
    ambiguous in the source available to us.

Threats to validity, general categories: 

SC - statistical conclusion validity C - construct validity 

I - internal validity E - external validity 

Lower case letters denoting particular threats within the categories are defined in Appendix B. 
videotaped behaviors using Flanders Interaction Analysis
2) Rating of cognitive levels of classroom & quiz questions using a modified version of
   Teacher-Pupil Question Inventory (Davis & Tinsley)
3) Cornell Inventory for Student Appraisal of Teaching & Courses (Hooa)

Stated Results:
1) E group made significantly greater use of objectives (p=.0?) & engaged in significantly more
   student-centered teaching (p=.04) than C group.
2) E group student ratings of effectiveness were significantly higher than for C group (p=.10).
3) No differences between groups on student-talk ratios.
4) For both groups, use of indirect teaching skills was positively correlated with student ratings
   of instructional effectiveness.
5) No differences between groups on congruity among cognitive levels of classroom & quiz
   questions. No relationship shown between congruity variable & student ratings of instruction.
(Due to small sample size, significance level of p<.10 was used)
Weaknesses: Small N
Strengths: Randomization
Confidence Rating: Fair


Author/Date: [illegible]
Purpose: To assess the effects of a seminar on the teaching of psychology on teaching
performance of teaching assistants

Survey
  Code: X O
        X = seminar in teaching of psychology
  Participants: former participants of seminar over a 3 year period
  Instrumentation: Survey of opinions about seminar
  Stated Results: Topics rated most important were related to practical everyday work of a
  college teacher.

Study 1
  Code: E: O X O
        C: O   O
        X = seminar in teaching of psychology
  Participants: 49 teaching assistants of psychology
  Instrumentation: Student rating items developed by Isaacson, McKeachie, & Milholland (1964)
  Stated Results: E group made a significant gain on 1 of 3 factors, that of rapport, although
  magnitude of difference was small.

Study 2
  Code: E: O X O O (n=21)
        C: O   O O (n=11)
        X = seminar in teaching of psychology
  Participants: 32 teaching assistants in psychology
  Duration: 2 semesters
  Instrumentation: Student rating form developed by Isaacson, McKeachie, & Milholland (1964)
  Stated Results:
  1) After 1 semester, no significant differences between E & C groups. Group interaction factor
     approached .05 significance level in favor of E group.
  2) After 2 semesters, E group received significantly higher mean ratings on 2 of 5 factors,
     feedback & interaction.

Confidence Rating: Fair






WORKSHOPS / SEMINARS

Chart columns: Author/Date; Purpose; Components of Design (Code, Participants,
Duration, Instrumentation); Stated Results; Threats to Validity (SC, I, C, E);
Weaknesses; Strengths; Confidence Rating


Purpose: To assess the effects of a seminar for teaching assistants on teaching performance
Code: E: R O X O
      C: R O   O
      X = six 2-hour seminar sessions on instructional organization, techniques & materials
      & videotaping with feedback
Participants: 22 inexperienced TAs in economics, business administration & geography at
University of Illinois
Duration: 1 term
Instrumentation:
1) Illinois Course Evaluation Questionnaire
2) Instructor Self-Evaluation Form
3) Teacher Performance Appraisal Scale
4) Informal questionnaire for TA evaluation of training seminar
Stated Results:
1) ANCOVA used to adjust for initial differences between E & C on expert ratings. After
   adjustment, E group higher on expert ratings.
2) No significant differences between E & C on student ratings; no training or teaching
   experience effect on self-evaluation profiles; no teaching experience effect on expert ratings.
3) TAs' ratings of training seminars were favorable.
Weaknesses: Small N
Strengths: 1) Multiple measures 2) Randomization
Confidence Rating: Fair (tentative; rating based on abstract)



Purpose: To assess effects of instructing teaching assistants in Flanders interaction analysis
on classroom verbal behavior & on student achievement
Code: E: O X O X O X O
      C: O   O   O   O
      X = training in interaction analysis
Participants: [illegible] teaching assistants of the mathematics department at East Carolina
University (211 students)
Duration: [illegible] quarter
Instrumentation: Audiotapes analyzed using Flanders Interaction Analysis (FIA). TAs were
divided into: a) mathematics education TAs, b) mathematics TAs
Stated Results:
1) Significant differences in favor of E on 4 of 9 verbal behavior characteristics (I/D ratio,
   Steady-State cells, Area A cells, & Teacher Response to Student cells).
2) Significant differences in favor of the mathematics education TAs on 6 of 9 verbal behavior
   characteristics (I/D ratio, S/T ratio, Steady-State cells, Content Cross cells, Teacher
   Response to Student cells, & Student Talk Followed by Teacher Talk cells).
3) Significant differences in favor of mathematics education TAs on student achievement.
Weaknesses: 1) Unit of analysis-students?
Confidence Rating: Fair (tentative rating based on abstract)



Purpose: To assess the effects of a faculty development workshop on participants' level of
self-actualization
Code: O X O
      O   O
      X = [illegible]-day faculty development workshop consisting of discussions & series of
      micro-colleges
Participants: 22 college professors (participants in 2 groups equated on age & academic division)
Duration: approximately 1 week
Instrumentation: Personal Orientation Inventory (POI) (Shostrom)
Stated Results:
1) E group made significant increases on 6 of 12 scales (Inner Directedness, Self-Actualizing
   Values, Existentiality, Feeling Reactivity, Acceptance of Aggression, Capacity for Intimate
   Contact) while no significant changes for C group.
2) Pretests did not indicate significant differences between the groups.
Weaknesses: 1) Nonequivalent control group design 2) Small N
Strengths: 1) Use of self-actualization as dependent variable 2) Theoretical framework for
faculty development
Confidence Rating: Low






Purpose: To assess the differential effects of instruction in effective questioning, & student
rating feedback on teaching performance of teaching assistants
Code: E1: R O X1 O
      E2: R O X2 O
      C:  R O    O
      X1 = student rating feedback
      X2 = student rating feedback & instruction in effective questioning techniques (FIA)
Participants: 12 graduate teaching assistants randomly selected from the College of Business
Administration at Arizona State University
Instrumentation:
1) Flanders System of Interaction Analysis (FIA)
2) Purdue Instructor Performance Indicator (PIPI)
3) Minnesota Teacher Attitude Inventory (MTAI)
Stated Results:
1) No significant differences among E1, E2, & C in teaching performance as measured through
   indices derived from TAs' classroom behavior matrix (FIA).
2) No significant relationship found between TAs' teaching performance as measured by FIA & PIPI.
3) Significant relationship between TAs' MTAI attitude scores & 2 of 5 teaching performance FIA
   indices (Direct/Indirect influence & Teacher/Student talk ratios). Two other ratios suggested
   a strong association with MTAI.
Threats to Validity (SC, I, C, E): e b c
Weaknesses: Small N
Strengths: 1) Randomization 2) Multiple measures
Confidence Rating: Low (tentative rating based on abstract)



Purpose: To assess the effects of a TA training program on teaching performance
Code: O? X O
      X = training program on a wide range of topics (writing behavioral objectives, art of
      questioning, personal interaction, sensitivity)
Participants: Teaching assistants in both geology & chemistry departments at Florida State
University
Stated Results:
1) TA training caused significant change in teaching behavior including less teacher control,
   more individual interaction & more high-level questioning.
2) Use of desired teaching behaviors resulted in positive student attitudes toward class, TA
   as instructor, science in general & increased self-learning.
Weaknesses: No control group


Purpose: The Wichita State Study: to determine the effectiveness of faculty development
programs built on model of 'teachers helping teachers'
Code: E: R O X O
      C: R O   O
      X = faculty development activities conducted in group sessions or dyads
Participants: 32 randomly selected instructors from 82 volunteers
Duration: 1-10 weeks
Instrumentation: 13 item student rating form
Stated Results: ANCOVA-adjusted measures significantly higher for E on 5 of [illegible]
measures (total, overall rating as teacher, discussed opinions & ideas other than own,
encouraged class discussion, was aware if students understood subject matter).
Given lack of control over other factors that might influence performance (e.g., short
intervention period, small N), results offer strong support for this faculty development
procedure.
Weaknesses: Volunteer sample
Strengths: 1) Motivation was controlled 2) Randomization
Confidence Rating: Fair
Purpose: To assess the effects of a sequence of faculty development workshops on teaching
performance
Code: O X O
      X = Teaching Improvement Project System (TIPS)
Participants: 29 instructors of the University of North Dakota College of Nursing
Duration: approximately 2 [illegible]
Instrumentation: Faculty Enrichment & Assessment of Teaching evaluation system (FEAT)
(developed at the University of Kentucky)
Stated Results:
1) Significant gains for 14 of 26 items. The most significant gains were associated with items
   with factual content.
2) Multivariate analysis found significant differences among gain scores on 4 global variables.
   Univariate tests found significant gains for organization, presentation, & evaluation.
Weaknesses: No control for course content
Strengths: Multivariate analysis for global variables


To assess the effects of instructional analysis & feedback from an instructional specialist on the classroom behavior & student achievement of university teaching assistants

E1: O X1 O
E2: O X2 O
C: O O

X1 = review of data & remedial suggestions & activities with instructional specialist over 6 week period

X2 = review of data with instructional specialist

13 graduate student teaching assistants teaching a required freshman rhetoric course at University of Massachusetts

1 week

1) Videotapes analyzed by Flanders Interaction Analyses (Amidon & Flanders)

2) 31 item student evaluation form (SCAT) of Clinic to Improve University Teaching

3) Student achievement test (parallel forms)

1) Among trends noted in data: E1 & E2 instructors increased their using student ideas, focusing, summarizing, introducing or orienting statements & lecturing. Percentage of teacher talk in class increased & student talk decreased. C instructors showed an increase in silence in their posttest lessons. Their using student ideas increased slightly & there was a decrease in focusing, summarizing, introducing or orienting statements & lecturing.

2) Among trends in student evaluations: E1 showed positive change in clarity, evaluation & feedback & relating to student responses & C improved in relating to student responses.

3) No differences in achievement among 3 groups.

Small N (noted by investigator)

1) Use of multiple measures

2) Discussion of limitations

Fair



To assess the effects of a seminar on the teaching of economics on teaching performance & student performance



0 0 



(n-323) 
0X0 <n»34S) 



X = seminar on teaching economics consisting of student evaluation input, videotaped observations, & instructional seminars

761 students enrolled in principles of economics course (same 7 graduate instructors involved over 2 quarters)

1) Questionnaires dealing with student characteristics

2) Test of Understanding in College Economics (Part I, Forms A & B) (TUCE)

3) Postcourse use of Purdue Rating Scale for College Instructors

1) Student performance of E group increased significantly over C group.

2) Instructor ratings of E group were significantly higher than C group.

3) High association between instructor ratings & student performance on TUCE.



Fair 



WORKSHOPS/SEMINARS

Purpose

Components of Design
Code  Participants  Duration  Instrumentation

Stated Results

Threats to Validity
SC I C E

Weaknesses

Strengths

Confidence
Rating


To assess the effects of a training program for teaching assistants on verbal interaction & questioning

E: R O X O O
C: R O O O

X = seminars, microteaching, observation & conferences

New teaching assistants in freshman chemistry (number not specified in abstract)

1 term

1) Audiotapes coded according to:

a) Flanders Interaction Analysis Category System

b) Question Category System for Science

2) Placement tests in 4 fields of chemical knowledge

1) C group more successful in drawing students into discussion.

2) E group lectured less & used more praise & encouragement.

3) C group asked more questions.

4) Training program showed no effect on type of question asked or on proportion of correct responses elicited.



t t 



1) Multiple measures

2) Randomization

Fair (tentative rating based on abstract)



To assess the effects of a training program for teaching assistants on teaching performance & student-teacher interaction

O X O

X = ten 1-hour seminars based on rationale of Interaction Analysis for Science Teaching

12 teaching assistants in biology at Georgia State University

1) Teacher & student behaviors coded using Interaction Analysis for Science Teaching (IAST)

2) Nonverbal movement of TA's was recorded

3) Questions asked by TA's were analyzed for number & level

4) Rokeach Dogmatism Scale

5) Kolu Conflict Test

6) Teacher Concern Statements

1) Significant changes in the following IAST ratios: I/D teaching ratio, S/T talk ratio, revised I/D teaching ratio.

2) Significant change in teacher behavior block, an interaction region on the IAST matrix, but no changes in 3 other blocks.

3) Significant change in nonverbal movement of TA.

4) TA's increased amount of time spent with students.

5) Significant changes in TA's total number of questions & number of convergent & divergent questions, but no change in managerial & rhetorical questions asked.

6) No significant changes or correlations for other scales & measures.



e 

b 
c 
e 



Small N

Multiple measures



si 






Purpose

To assess the differential effects of training teaching assistants in Polya's heuristic questioning strategies and/or Flanders Interaction Analysis on verbal interaction, problem solving sequence, achievement & evaluative perception of TAs by students

Components of Design

Code  Participants

243 undergraduate students in introductory calculus course for majors other than math or engineering (18 calculus sections involved)



Duration  Instrumentation

1) Flanders Interaction Analysis (FIA)

2) Polya's heuristic questioning strategies (niQPs)

3) Donkey-mule problem

4) Achievement test covaried with course grade, CEEB scores, & Nelson-Denny vocabulary scores



Stated Results

Threats to Validity
SC I C E

Weaknesses

Strengths  Confidence Rating

Overall, FIA & Polya questioning training of TA's significantly affected verbal interaction of TA's with their students, the problem solving sequence of the TA's & their students, the achievement of TA's students, & the evaluative perception of TA's by their students.

Multiple

Fair (tentative rating based on abstract)



To assess the effects of an in-service program for teaching fellows on attitude toward teaching as a career, job satisfaction, interpersonal style of teaching, & student satisfaction with teaching fellow

E: X O X O
C: O

X = in-service program involving workshops and consultation



E-15 teaching fellows in chemistry department at University of Michigan

C-numbers not reported

(498 students)



2 terms 



1) Students of E group more satisfied than students of C group at end of fall term. Winter term students of E group more satisfied than fall term students of E group.

2) Change in attitude toward teaching seems related to reconsideration on part of teaching fellows of relative advantages & disadvantages of teaching.

3) Change in job satisfaction seems to be related to level of ambivalence toward teaching.

4) Change in self description seems to be related to certain perceptions of potential for an interpersonal style.



St 

ht 
It 






MICROTEACHING 



Author /Date 



Fortune, Cooper, & Allen

Purpose

To assess the effects of the Stanford Summer Microteaching Clinic (1965) on teaching performance



Code

X

i

Components of Design

O X O X O X O X O

Participants  Duration  Instrumentation



140 secondary education teacher interns

4 weeks

1) Stanford Teacher Competence Appraisal Guide (STCAG)

2) Questionnaire to evaluate student acceptance of microteaching

Stated Results

1) Trainees showed significant mean gain over 6 week session on 9 of first 12 STCAG items.

2) 70% of trainees indicated supervisory feedback was useful while 24% indicated pupil feedback was useful.



Threats to Validity
SC I C E

b

c

1) No control group

2) Possible testing effects

Strengths

Replication study of 1963 & 1964 clinics

Confidence
Rating



O X O X O X O X O X O X O X O X O X O X O X O X



Jensen & Young (1972)

To assess the effects of microteaching training on subsequent teaching performance

E: R X O O O
C: R (X) O O O

X = microteaching training
(X) = conventional student teaching practice

37 subjects selected from a teacher training program

3 sessions of microteaching training & [?] weeks in assigned classroom

Teacher Performance Evaluation Scale

E group received higher ratings on 5 of [?] factors (personality traits, teacher warmth, general classroom atmosphere, lesson usefulness, teacher interest in pupils) than C group. Microteaching is beneficial although superiority sometimes not evident until third observation.

High






Johnson

To assess the effects of combined training in Flanders Interaction Analysis & microteaching labs on instructor interaction behavior, questioning & reinforcement techniques

O X O

X = combined training in Flanders Interaction Analysis (FIA) & in microteaching

14 community/junior college professors






Kallenbach

To assess the effects of microteaching training on subsequent teaching performance

E: R O X O O O (n=19)
C: R O (X) O O O (n=18)

X = microteaching training
(X) = conventional student teaching practice

37 students selected by education department to begin elementary teacher training program in summer 1964 (San Jose State College)



one summer

approximately 1 year

Analysis of videotapes using FIA & author's descriptions of questioning & reinforcement techniques

1) Stanford Teacher Competence Appraisal Guide (STCAG)

2) Instrument for the Observation of Teaching Activities (IOTA)

Trainees improved significantly on all variables:

a) Significant gains were shown for pupil talk, teacher question ratio, direct or indirect influence reinforcement, probing questions & higher order questions.

b) Significant reduction shown for teacher talk.

1) No differences between E & C on post-training ratings.

2) The two groups did differ on pretest measures so ANCOVA was carried out but no significant differences were found.

1) Volunteer sample

2) Small N

Loss of some videotapes (problem noted by investigator)



Good 

discussion 



High 



Kremer & Perlberg (1979)

To assess the effects of microteaching training in independent learning teaching strategies on teaching & pupil performance

E: O X O
C: O O

X = workshop involving demonstration, discussion, peer teaching & microteaching

22 elementary inservice teachers

446 pupils of 9-12 years of age

1 school year

Analysis of videotapes:

1) Teaching style measured by behavior counts using Verbal Inventory Category System (Amidon & Hunter, 1967)

2) Fluency of pupils' questions measured by counting their number

3) Level of pupils' questions & problems analyzed using categories suggested in Bloom's taxonomy

1) E teachers talked less, gave less information, asked broader questions & gave more directions than C teachers.

2) E pupils showed significant behavior changes compared to C group for 3 of U variables, that of responds to teacher, initiates talk to teacher & initiates talk to another pupil.

3) E pupils showed significant increases in number of problems & questions voiced, but significant differences in higher level questions for E pupils only found for 2 of 7 variables, divergency & analysis.

1) Good literature review

2) Theoretical framework

3) Inclusion of qualitative data

Fair



Author/Date

Perlberg, Bar-On, Levin, Bar-Yam, Levy &

Purpose

1) To assess the effects of microteaching training combined with Technion Diagnostic System (TDS) computerized feedback on teaching performance

2) To investigate linear relationships between the student's performance in different lessons

3) To assess differential effects of treatment on experimental subgroups

Components of Design
Code  Participants  Duration  Instrumentation

Stated Results

O X O

X = microteaching training combined with Technion Diagnostic System computerized feedback

60 students in Teacher Training program at Technion Institute enrolled in 'Principles of Teaching Methods' course

approximately 2 semesters

ratings of videotaped lessons on 13 categories

Threats to Validity
SC I C E

1) Trainees showed significant changes on all 4 combined scores (non-verbal, not lecturing, relates to, analytical thinking).

2) Increases in first three combined scores reached peak at end of training & posttest showed a decrease from the last training session.

3) No linear relationship found between pre and posttest scores.

4) Treatment effective both for those with low entry behavior & those with some teaching experience. Low entry participants gained more from treatment.

Weaknesses

No control group

Strengths

Use of Technion Diagnostic System computerized feedback

Confidence
Rating

Low 



Pcrlbtrg. 
V*ri. Waln- 
St tier, 
S S-.i-roa 
(1*72) 



1) To assess the effects of microteaching training in student-centered & classroom interaction styles on teaching performance

2) To investigate the relationship between changes effected by microteaching & a participant's openness

O X O

X = microteaching training

16 faculty members, 30-60 years old

2 sequences of [?] weeks each (each faculty member went once a week)

1) Rokeach's dogmatism scale

2) P-A (permissive-authoritarian) scale

3) Bipolar adjective scale (based on Osgood Semantic Differential)

4) Flanders Interaction Analysis used to analyze pre & post videotapes

1) Trainees showed significant improvement on all 7 teaching skills (lesson organization, lecture style, providing examples, fluency in questions, probing questions, higher-order questions & divergent questions).

2) Differences for questioning skills were greater than for lecturing skills.

3) Trainees showed substantial increase in use of all questioning skills, the increase in higher order & divergent questioning being the greatest.

4) Perseverance in microteaching clinic found to be best predictor of openness to change & willingness to accept innovation.

a
b
c

1) No control group

2) Subject mortality

Microteaching applied to higher education



Low 



Author/Date

Perry, Leventhal & Abrami (1979)

Purpose

To assess the effects of the Modified Observational Learning (MOL) procedure on teaching performance of instructors who differ in pretraining teaching ability

Components of Design
Code  Participants  Duration  Instrumentation

E: R O X O
C: R O O

X = Modified Observational Learning (MOL) involving videotape feedback, cognitive discrimination training

157 introductory psychology students at University of Manitoba (4 instructors involved)

questionnaire consisting of:

1) 2 single item student rating measures

2) achievement subscales:

a) student competence in chemistry

b) content covered in lecture material

Stated Results

Threats to Validity
SC I C E

1) For high effective lecturers, MOL training produced more favorable student ratings on teaching ability, on lecture value, & produced greater student achievement than no training.

2) For low effective lecturers, MOL training produced less favorable ratings than no training on lecture value & no differences on teaching ability & student achievement.






Weaknesses

1) Unit of analysis-students

2) Separate pretest-posttest samples

3) Novice instructors

Strengths

1) Microteaching applied to higher education

2) Multiple measures

Confidence
Rating

Fair



Wagner (1973)

1) To assess the relative effects of cognitive discrimination training, microteaching & a control condition on student-centered teaching performance

2) To assess the relative effects of cognitive discrimination training, microteaching & a control condition on teachers' ability to discriminate classes of teaching behavior

E1: R X1 O O
E2: R X2 O O
C: R (X) O O

X1 = cognitive discrimination training

X2 = microteaching training

(X) = conventional student teaching practice

75 undergraduates from 5 sections of introductory educational psychology course

approximately 2 weeks

1) Observer ratings of teacher responses to student comments

2) Discrimination test consisting of coding teacher responses to students' comments

1) E1 group was significantly more student-centered (ask for clarification, restate, use of student's idea) than E2 or C groups.

2) E2 group not significantly more student-centered than C group.

3) E1 was better able to discriminate teaching behaviors than C. E2 did not differ from E1 & C on discrimination test.

1) Time lag between observations (noted by investigator)

2) Short duration

3) Discrimination test precluded assessment of whether subjects had learned to attend to relevant dimension

Good discussion

High






MINICOURSES 



Author/Date 



Purpose

Components of Design
Code  Participants  Duration  Instrumentation

Stated Results

Threats to Validity
SC I C E

Weaknesses

Strengths

Confidence
Rating



Borg (1972)

To investigate the persistence of behavior change of teachers completing Minicourse 1

O X O O

X = Minicourse 1: Effective Questioning

24 of 48 elementary teachers who participated in initial experiment

approximately 3 years

Scoring of videotape transcripts on use of Minicourse 1 skills

Initial Experiment (N=48)
11 of 13 teacher & student behaviors showed significant improvement

4 Months Later (N=38)
a) 3 of U measured skills continued to improve
b) No significant regression on any skill

39 Months Later (N=24)
a) 6 of 10 measured behaviors still significantly superior compared to precourse means
b) Teacher talk regressed significantly but still below initial frequency
c) 1-word pupil response frequency was up significantly & higher than pre-course mean

1) Volunteer sample

Discussion of limitations

Fair



Buttery & Michalak (1973)

To assess two modifications to the minicourse format:

a) use of "Teaching Clinic" process feedback system (no videotape equipment used)

b) naturalistic setting

E: R O X1 O
C: R O (X) O

X1 = Minicourse 1: Effective Questioning coupled with "Teaching Clinic" feedback

(X) = conventional student teaching practice

40 undergraduate elementary school majors at University of Georgia

approximately [?] weeks

Coding of audio cassette tape recordings for 13 teaching behaviors relevant to Minicourse 1

11 of 13 't' ratios significant at .05 level for E group (3 significant at .01 level) while 2 of 13 't' ratios significant at .05 level for C group.

a c 4 a
d »
h c



Fair 



Collins (1978)

To assess the effects of an enthusiasm minicourse on subsequent teaching performance

E: R O X O O

C: R O O O

X = minicourse on enthusiasm

20 preservice elementary teachers (participants not aware of experiment)

[?] weeks

1) Rating form to assess 6 variables in terms of level of performance

2) Tally sheet to record frequencies of 6 variables

1) E group showed a significant increase in teacher enthusiasm between pretest & posttest I, & posttest I & II.

2) No differences in the C group in teacher enthusiasm among 3 testing periods.

1) Observers blind to experiment

2) Randomization

3) Repeated measures ANOVA



High 






Perrott, Applebee, Heap, & Watson (1975)

1) To assess the effects of Minicourse 1 on teaching performance in Great Britain

2) To investigate the international transfer of Minicourse 1 to Great Britain



1 X. 0 2, 0 0 
10 



X1 = informing of target behaviors at pretest

X2 = Minicourse 1: Effective Questioning

21 inservice junior & secondary school teachers

1) Scoring of videotapes on 14 aspects of teaching behavior related to Minicourse 1

2) Questionnaire on teacher's perceptions of course effects

1) No significant difference was found between E & C on effects of knowledge of target skills on pre-course performance.

2) Multivariate effect for time was highly significant while multivariate effect for centre X time interactions was not significant.

3) For planned contrasts, [?] of 14 measures showed significant differences between pre-course & both post-course sessions.

4) Findings suggest that familiarity with videotaping at posttest may be a cause of differences between pre- & post-course performance.

1) Well-planned multivariate analyses

2) Replication of Borg (1970)

3) Presents evidence for mixed findings of stability of teaching performance over time

4) Randomization

High



RATINGS

Author/Date

Purpose

Components of Design
Code  Participants  Duration  Instrumentation

Stated Results

Threats to Validity
SC I C E

Weaknesses

Strengths

Confidence
Rating



Aleamoni

(1573) 



To assess the combined effects of student rating feedback and consultation on faculty performance from one semester to the next semester in which the course is taught



I 

O'X 



X = student rating feedback & consultation (involved problem identification & suggestions for resolution)

E-20 instructors teaching [?] courses

C-13 instructors teaching 18 courses

([?] students involved, the course was the unit of analysis)

1 year or [?] years depending on when course was taught

Illinois Course Evaluation Questionnaire (CEQ)

E significantly improved on 2 (Course Content and Instructor) of 5 dimensions

1) All subjects wanted treatment (resentful demoralization)

2) Repeated measures ANOVA analysis may have been inappropriate



Low 



gledsce 

(I97i) 



1) To assess the effects of mid-term student rating feedback on end-of-term faculty performance

2) To compare instructor self-ratings with class ratings at mid-term and end-of-term

O X O

X = student and instructor rating feedback and student-instructor dialogue concerning ratings

1 instructor and 31 advanced graduate students at University of Georgia

1 quarter

26 item student evaluation Faculty-Course Evaluation Form (used at University of Georgia)

1) Instructor received significantly higher end-of-term class evaluations as a result of mid-term class feedback and dialogue, but instructor decreased his self-evaluation.

2) Class means on items correlated [?] on 2 occasions. Mid-term self-ratings correlated .60 & .65 with class evaluations.

3) Greatest gains made on items rated lowest at mid-term.



0 0 0 

• c 0 
o o c 



Only 1 instructor involved and he was also the experimenter

Low

Braunstein, Klein & Pachla (1973)






1) To assess the effects of mid-term student rating feedback on end-of-term faculty performance

2) To explore the effects of discrepancies between mid-term faculty self-rating and student rating on end-of-term ratings

E: R O X O
C: R O O

X = student rating feedback

At Oakland University in Detroit:

E-15 classes (10 different professors)

C-12 classes (9 different professors)

semester

29 item teaching evaluation instrument

1) Change score analysis was used due to nonequivalence of groups as indicated on mid-term ratings.

2) E showed a strong increase in positive changes while C showed strong increase in negative changes.

3) When an instructor's expectancy is discrepant from students' ratings for a trait, a subsequent shift for that trait is likely.

1) Theoretical framework for experiment

2) Thorough discussion

3) Randomization

Fair



Butler & Tipton (1974)

To assess the effects of mid-term student rating feedback on end-of-term instructor performance (and to investigate the reliability of student ratings over time)

O O X O

X = student rating feedback

17 instructors from English Department at Virginia Commonwealth University (1000 students involved)

approximately 3 months

rating scale consisting of 13 items

4 of 17 instructors showed significant improvement on post-ratings (student ratings averaged across 13 items)

Volunteer sample



Centra (1973)

1) To assess the effects of mid-semester student rating feedback on subsequent faculty performance across several types of post-secondary institutions

2) To assess the effects of student-instructor rating discrepancies at mid-term on end-of-term faculty performance



C 3 : 



10X00 (n-») 

to 0 

1 0 0 (Ml)) 

0 (n-30) 



X = student rating feedback

Instructors from 5 institutions

Mid-semester: 505 college instructors

End-of-semester: 436 college instructors

Spring semester: 51 college instructors

2 semesters

23 item Student Instructional Report (SIR)

Based on pretest ratings, instructors divided into:

a) more favorably rated

b) less favorably rated

1) E did not differ from C1 & C2 on end-of-semester ratings (sex, subject area, college & teaching experience were controlled).

2) 5 of 17 items showed significant improvement in favor of less favorably discrepant group over favorably discrepant group and 13 of 17 items indicated a similar trend.

3) In terms of changes over time, E received better ratings than C2 & C3.

1) Excellent discussion ruling out plausible hypotheses

2) Randomization



High 






(1979) 



Study 1: To assess the combined effects of mid-term student rating feedback and consultation on end-of-term faculty performance

Study 2: To check whether the results of Study 1 just reflect differing group expectations of change

Investigators claim these studies are quasi-experimental

E: R O X O
C: R O O

X = student rating feedback and consultation (including interview, observation, and videotaping)

O X O

(This observation used same data as first observation of Study 1)

X = student rating feedback and consultation (including interview, observation, and videotaping)

31 faculty of University of Rhode Island

20 faculty from Study 1 who agreed to participate (14 of Study 1 were on leave)

1 semester

1-4 semesters (depending on when a similar course to that of Study 1 was scheduled again)

1) early semester Teaching Analysis by Students (TABS): Short form A, Part 1

2) late semester TABS: Short form A, Part 1

3) Two 11 item questionnaires on effectiveness of consultation procedure

Same as for Study 1

Study 1: The E group late semester faculty and student ratings on all 3 components (Stimulation, Organization, Evaluation) were more positive than C group. E instructors indicated a positive attitude toward the procedure.

Study 2: Differences between semester I & II significant for 11 of the instructors.

1) Of 700 invited, only 31 agreed to participate

2) Volunteer sample

3) Since student raters were told of study, results may reflect differing expectancies of change (noted by investigator)

Small N

Randomization

High

Investigator's check for expectancy effects



Erickson & Sheehan (1976)






1) To assess the 
relative ef fecte 
on end-of-tcrm , 
faculty perfor- 
uancc of 

s) student retlng 
feedbjc'tf with 
consu 1 i at i on 

It) student rating 
feedback elonc, 
*> 

c) no feedback 

2) "to assess set- 

Is fact Ion with 
te ithlii£ Impiovs- 
cent process 

3) .o Investigate 

f Acuity 4. student 
attitudes toward 
selves, coursea, 
& teaching 



E2= R 0 X 
C; R 0 



R 0 X| 0 (n-13) 
0 (n-13) 
0 (n-14) 



%\ m full process 
(rating feedback *> 
consultation, lu- 
eluding interview, 
obticrvat Ion *> 
videotaping) 

X2"dlagnoetlc 
(rating feedbeck 
only) 



40 far 
f rot o 
ecedemlc 
department* 



approxi- 
mately 6 
weeke 



1) Teechlng 
Anelyele by 

Studente 
(TABS) 

2) Inetructor 
Quest ionnalre 

3) Student 
Quest ionnalre 

4) Evaluation 
of Teaching 
Clinic (Pert 
I) 

(ell Instru- 
ments designed 
by clinic) 



1) No significant differences among groups
2) E1 faculty were satisfied with teaching improvement process

1) Volunteer sample
2) No investigation of whether teaching skills amenable to change would affect student learning (noted by investigator)

1) Discussion of limitations
2) Randomization



High

RATINGS

Author/Date   Purpose   Components of Design   Stated Results   Threats to Validity   Weaknesses   Strengths   Confidence
Code   Participants   Duration   Instrumentation   SC I C E   Rating

Friedlander
(1978)

Hoyt &
Howard
(1978)



To examine student perceptions of instructor change as a result of:
a) student rating feedback to instructors
b) instructor's discussion with them about feedback

O X O O



The Kansas State Studies
4 studies/surveys to evaluate effectiveness of Faculty Development Office activities

X = student rating feedback & student-teacher discussion



2,016 graduate students in §3 courses, UCLA Graduate School of Management

1 quarter

1) Mid-Quarter Course Evaluation (MQCE)
2) End-of-quarter rating form
3) Experimental Questionnaire



Survey 1: To investigate outcomes of contact with Faculty Development Office

Survey 2: To investigate satisfaction with Graduate Teaching Assistant Orientation Workshop

Study 1: To assess the effects of student rating feedback & consultation on subsequent faculty performance



X O
X = contact with Faculty Development Office

X O
X = Orientation Workshop

O X O
X = student rating feedback & consultation



3gl faculty 



approximately 1 year

63 graduate teaching assistants (GTA)



263 faculty 



A greater percentage of students who reported a meaningful & helpful discussion of the MQCE attributed change in their course to the MQCE (77%), than students who reported an inadequate discussion (30%), or no discussion of the MQCE although such discussion was needed (13.6%)



2 or more terms between 1969 & 1972

User Satisfaction Faculty Survey

Survey of Orientation Workshop (1976)

Student rating form



Respondents indicated satisfaction with most aspects of services. While substantial numbers became involved at a superficial level (56% tried a new approach), only a small number made serious efforts to improve (13% sought help from office).

GTAs found orientation workshops more helpful in dealing with administrative detail than in working with students or faculty members.

Significant improvements shown for 13 of 13 measures. While results statistically significant, they are not dramatic in absolute sense. Results consistent with expectation that voluntary participation in student evaluation programs with feedback can help faculty improve instructional effectiveness.

Optional for teachers to give out forms & for students to respond.



Low 






Volunteer 



Low

RATINGS

Author/Date   Purpose   Components of Design   Stated Results   Threats to Validity   Weaknesses   Strengths   Confidence
Code   Participants   Duration   Instrumentation   SC I C E   Rating

Hoyt &
Howard
(1978)
continued



Study 2:
Part 1: To assess the effects of student rating feedback & consultation on subsequent faculty performance
Part 2: To examine instructional improvement relative to contact with office



O X O
X = student rating feedback & consultation



341 faculty 



2 or more terms between 1973 & 1975

Student rating form (revised)



Posttest mean for "Progress on Relevant Objectives" significantly higher than pretest mean.

ANCOVA-adjusted measures of effectiveness increased as a function of amount of contact with director (14 of 18 measures increased). Significant improvement resulted when consultative services made available to motivated faculty.

1) Volunteer sample
2) ANCOVA on nonequivalent control groups



Low

Marsh,
Fleiner,
& Thomas
(1975)

To assess the effects of midterm student rating feedback on end-of-term course evaluations & achievement (validity also assessed)



E: R O O X O
C: R O O O
X = student rating feedback

287 UCLA students (18 different sections involved & instructors were graduate students)

1 quarter

1) Pretest designed to predict final exam performance
2) Final exam
3) 46-item evaluation instrument (UCLA developed)
4) Short form of 46-item form



1) E students had significantly higher responses on summary comparison item; on 8 of 46 items and on 2 (instructor approachability & value of the readings) of 7 evaluation factors.
2) No significant differences between groups in overall student performance.

Student was unit of analysis

Randomization   Fair



McKeachie &
Lin
(1975a)

To investigate the effects of discrepancies between mid-term student ratings & faculty self-ratings of expected & ideal teaching performance on faculty performance

O X O
X = student rating feedback

28 instructors of introductory psychology classes at University of Michigan

semester

32-item Michigan Student Perception of Teaching form

Based on mid-term student & faculty ratings, instructors were divided into 8 groups



1) Significant differences for 2 (group interaction & feedback) of 7 dimensions were found for those whose expected & ideal ratings were higher than student ratings.
2) The group rated more highly by students than by themselves changed in a negative direction (on feedback only).



a a • 

b b • 

a e c 
4 



1) Unclear text
2) Unwarranted conclusions



Low 



RATINGS

Author/Date   Purpose   Components of Design   Stated Results   Threats to Validity   Weaknesses   Strengths   Confidence
Code   Participants   Duration   Instrumentation   SC I C E   Rating

McKeachie &
Lin
(1975b)



To assess the relative effects on end-of-term faculty & student performance of
a) mid-term student rating feedback combined with consultation
b) student rating feedback alone &
c) no feedback



E1: R O X1 O
E2: R O X2 O
C: R O O

X1 = student rating feedback & consultation
X2 = student rating feedback



37 graduate assistants & 3 faculty teaching introductory psychology courses at University of Michigan

14 week term

1) 32-item Michigan Student Perception of Teaching & Learning (McKeachie & Lin)
2) Selected items from Introductory Psychology Criteria (Milholland)
3) Attitude Toward Psychology questionnaire
4) Attitude Toward Self questionnaire
5) Attitude Toward Mental Illness questionnaire
6) Curiosity Test



1) Significant differences in favor of E1 group for both general teaching effectiveness & overall value of course & for 1 (impact on students) of 7 dimensions.
2) E1 was significantly higher in student achievement for 1 set of psychology classes as measured by Criteria Test & for measure of Curiosity in another set of classes.
3) Among groups initially rated low, medium or high, no significant differences on final criterion measures.

1) Unclear text
2) Unwarranted conclusions

Randomization

Fair



RATINGS

Author/Date   Purpose   Components of Design   Stated Results   Threats to Validity   Weaknesses   Strengths   Confidence
Code   Participants   Duration   Instrumentation   SC I C E   Rating

Miller
(1971)

1) To assess the effects of mid-term student rating feedback on end-of-term faculty & student performance
2) To investigate instructor attitudes toward value of student ratings



E: R O X O O
C: R O O O

X = student rating feedback

36 teaching assistants (TAs) teaching courses in religion or earth science (approximately 2000 students involved)



i 

**■ 



1) Survey of Student Opinion of Teaching (SSOT)
2) Instructor Attitude Questionnaire based on first 10 items of SSOT
3) Student achievement on mid-term & final



1) Instructors in feedback & attitude groups did not differ significantly on end-of-term ratings.
2) In 2 of 3 courses, no significant differences on final exam scores for feedback or attitude groups.
3) In 3rd course, significant difference on achievement in favor of feedback condition (p<.01)

Use of instructors as unit of analysis may have resulted in sampling errors due to small n per cell (data combined for sections) (noted by investigator)

1) Randomization
2) ANCOVA analysis

High

Based on Instructor Attitude Questionnaire, instructors divided into:
1) Feedback/Favorable Attitudes
2) Feedback/Unfavorable Attitudes
3) No Feedback/Favorable Attitudes
4) No Feedback/Unfavorable Attitudes



Murphy &
(IWo) 



1) To assess the relative effects on faculty performance of
a) student rating feedback combined with consultation
b) student rating feedback alone &
c) no feedback
2) To investigate a problem-solving approach to utilizing feedback



E1: R O X1 O
E2: R O X2 O
C: R O O

X1 = student rating feedback & consultation (augmented feedback utilizing non-expert consultants)
X2 = student rating feedback (simple feedback)



70 faculty at University of Texas (each randomly selected from pool of potential subjects)

15 week semester

Adapted form of Course Instructor Survey: General Questionnaire (developed at University of Texas)



1) E1 not significantly different from E2 in improvement of ratings.
2) E1 & E2 showed more gain (statistically) than C.
3) Instructors receiving feedback did not utilize feedback in item-by-item problem-solving approach.

f c s
g b
c

1) Although statistically significant difference in gains between feedback & no feedback conditions, gain was small in absolute sense (noted by investigator)
2) Change score analysis--should ANCOVA have been used?

Randomization

Fair



RATINGS

Author/Date   Purpose   Components of Design   Stated Results   Threats to Validity   Weaknesses   Strengths   Confidence
Code   Participants   Duration   Instrumentation   SC I C E   Rating



Oles &
Lencoski
(1973)



To assess the effects of student evaluations on faculty self-ratings

O X O
O O

X = student rating feedback

24 instructors in a graduate school of education

approximately 2 weeks



1) 12-item instructor evaluation form
2) 12-item student evaluation form

1) C group's test-retest correlation coefficient was .12 while E group's correlation was .54.
2) Chi square test on total number of changes was significant at .001.
3) Instructor self-rating changes not always in direction of student ratings

1) Nonequivalent control group design
2) Reliability of measures not reported



Overall &
Marsh
(1974)

To assess the effects of student rating feedback on faculty & student performance & to assess affective consequences of such a procedure (application of subject matter & plans to pursue subject further)

E: R O O X O
C: R O O O

X = student rating feedback

993 UCLA undergraduates who completed an introductory course in computer programming during Fall, Winter, or Spring 1973-74

quarters



1) Pretest to predict final exam performance
2) Evaluation of Instruction Program questionnaire
- 7 dimensions of teaching
- questions on affective consequences
3) Final exam



1) Significant differences in favor of E for 2 summary items (overall rating of instructor, & of course), for perceived difference in instructional quality and for 4 (concern, learning, interaction, & examinations) of 7 dimensions.
2) E significantly higher on exam performance.
3) E gave more favorable responses to affective consequence items; E significantly higher on 3 of 5 items.

Unit of analysis = student

1) Investigation of affective consequences
2) Randomization
3) ANCOVA analysis



Fair 



Pambookian
(1974)

To investigate the effects of discrepancies between mid-term student & faculty self-ratings on end-of-term performance

O X O
X = student rating feedback

13 teaching fellows teaching psychology at University of Michigan

1 semester (approximately 14 weeks)



21 items from Student Opinion Questionnaire (SOQ) revised by McKeachie-Lin

Based on pretest ratings, subjects divided into:
a) more favorably rated (F) (n=7)
b) more moderately rated (M) (n=3)
c) more unfavorably rated (U) (n=3)



1) Significant differences among groups on rapport & strong trends on skill (F=3.23, df=2/10, p<.09), overload (F=3.55, df=2/10, p<.07), & interaction (F=3.24, df=2/10, p<.09)
2) Individual "t" tests to compare gain scores between groups:
a) Between F & M, significant differences on skill, interaction, & rapport in favor of M
b) Between U & M, no significant differences
c) Between M & F, significant differences on overall value of course in favor of M
d) Trends in favor of M over U in rapport and toward less work overload.



d b 

t c 

h 



1) Small N
2) Individual "t" tests used to further investigate non-significant differences in findings using ANOVA (fishing & error rate problem)
3) Change score analysis

Low



RATINGS

Author/Date   Purpose   Components of Design   Stated Results   Threats to Validity   Weaknesses   Strengths   Confidence
Code   Participants   Duration   Instrumentation   SC I C E   Rating

Pambookian
(1974)

To investigate the effects of discrepancies between mid-term student & faculty self-ratings on end-of-term faculty performance

O X O
X = student rating feedback

13 teaching fellows teaching introductory & educational psychology



1 term

21 items from Student Opinion Questionnaire (SOQ) revised by McKeachie-Lin

Based on pretest ratings, subjects were divided into:
a) unfavorably discrepant (UD) (n=2)
b) minimally discrepant (MD) (n=2)
c) favorably discrepant (FD) (n=9)



1) Differences among groups on skill (p<.2), feedback (p<.3), & rapport (p<.01).
2) Individual "t" tests to compare gain scores between groups:
a) UD significantly changed more on skill, feedback, rapport, general teaching ability & overall value of course than FD.
b) MD improved significantly on rapport compared to FD & showed strong trends in same direction on skill.
c) Least gains made by FD.

1) Small N
2) Individual "t" tests used to further investigate non-significant finding using ANOVA (fishing & error rate problem)
3) Change score analysis

Discussion of limitations

Low



Rotem

To assess the effects of mid-term student ratings & faculty self-ratings of actual & desirable teaching performance on end-of-term faculty performance.

R O X O
R O O
R O

X = student rating feedback

31 instructors at University of California at Santa Barbara (2,9*0 students involved)

1 term



Student rating form of items selected from a set described by Isaacson, McKeachie, Milholland, Lin, Hofeller, Baerwaldt, & Zinn.
6 factors:
a) overload
b) organization
c) feedback
d) interaction
e) rapport
f) skill



1) On student ratings & instructor ratings, no significant differences between group means as a result of feedback or prior experience with pretest.
2) No functional relationship between rating discrepancies & posttest ratings.

1) Volunteer sample
2) Short time interval between pre & posttests due to quarter system

1) Excellent discussion
2) Good design
3) Planned comparison contrasts
4) Multiple regression analysis for discrepancies
5) Randomization

High



RATINGS

Author/Date   Purpose   Components of Design   Stated Results   Threats to Validity   Weaknesses   Strengths   Confidence
Code   Participants   Duration   Instrumentation   SC I C E   Rating

Sherman
(1977-78)

To assess the effects of student rating feedback (formative feedback) on subsequent faculty performance



5 2 

rt n 

3 3 

n n 
n rr 
• • O 

mI r 

o 1 o 
o* o 

M r 

o o 
I 

I" 

l o 



Instructor 1 = 35 students (health class)
Instructor 2 = 23 students (educational psychology class)

Student rating form consisting of:
a) value of instruction
b) quality of instruction
c) explanation of ratings

1) For Instructor 1, significant differences between baseline & FHS on value of instruction.
2) For Instructor 2, significant differences between baseline & FHS on quality of instruction.

1) Cumulative effect of treatments
2) Unclear text

X1 = Feedback Low Specificity (FLS)
X2 = Feedback High Specificity (FHS)



Tuckman &
Oliver
(1969)

To assess the relative effects on faculty performance of:
a) student rating feedback,
b) supervisor feedback,
c) student rating & supervisor feedback, &
d) no feedback



E1: R O X1 O
E2: R O X2 O
E3: R O X3 O
C1: R O O

X1 = Student rating feedback
X2 = Supervisor feedback
X3 = Student rating & supervisor feedback



286 teachers of vocational subjects at high school or technical level

15 additional teachers in posttest-only condition

1 semester (12 weeks)

Student Opinion Questionnaire (SOQ) developed by Bryan



1) E1 & E3 showed significantly greater change than E2 & C1.
2) E1 & E3 were statistically comparable, indicating a failure for supervisor feedback to generate any change beyond that accounted for by student feedback alone.
3) E2 produced a significantly greater negative shift (that is, opposite to feedback recommendations) than C1.
4) C2 served to rule out testing effects.

Change score analysis--should ANCOVA have been used?

1) Excellent discussion
2) Large N
3) Teacher years of experience was controlled
4) Randomization



RATINGS

Author/Date   Purpose   Components of Design   Stated Results   Threats to Validity   Weaknesses   Strengths   Confidence
Code   Participants   Duration   Instrumentation   SC I C E   Rating

Vogt &
Lasher
(1973)

To assess the effects of student rating feedback on subsequent faculty performance



o I o 

x t x 

o I o 

x i x 

O 1 o 

X , X 

0,0 

X t x 

o , o 

X i X 
O 1 o 



X = student rating feedback

Group A: 30 instructors who were members of Bowling Green College of Business Administration at time that mandatory student evaluation system introduced, Winter 1969-70 (22,141 students in 1000 courses)

Group B: 13 instructors who joined Bowling Green College of Business Administration in September, 1970 after introduction of mandatory student evaluation system (4317 students in 193 courses)

quarters

Bowling Green evaluation form (open-ended questions & student assignment of grade as index of teaching performance)

Regression coefficients of regression equations not significant--student rating feedback did not result in improved faculty performance.

4 b
• c
f 4
g



loir 



RATINGS

Author/Date   Purpose   Components of Design   Stated Results   Threats to Validity   Weaknesses   Strengths   Confidence
Code   Participants   Duration   Instrumentation   SC I C E   Rating



Weerts
(1978)

To assess the combined effects of mid-term student rating feedback & consultation on end-of-term faculty performance



E1: R O X1 O
E2: R O X2 O
C: R O O

X1 = student rating feedback
X2 = student rating feedback with consultation & student-instructor dialogue

54 classes in Rhetoric program at University of Iowa (3 full-time faculty & 51 graduate TAs)



1 school term

21-item Student Perceptions of Teaching form (SPOT) (Whitney & Weerts) (items chosen from a pool of items)

1) No significant differences among 3 groups from mid-term to end-of-term (a 2-factor ANOVA with repeated measures on 1 factor was used & an alpha level of p<.001 used because of 25 separate analyses).
2) No significant differences among 3 groups at end-of-term.
3) For 20 of 26 items, E2 had higher ratings than C. Statistically, chances for this occurring less than 5%.
4) For 23 of 25 items, E1 had higher ratings than C. Statistically, chances for this occurring p<.001.

Repeated measures ANOVA with repeated measures on 1 factor--should MANOVA or MANCOVA have been used?

Randomization   Low



PROTOCOLS

Author/Date   Purpose   Components of Design   Stated Results   Threats to Validity   Weaknesses   Strengths   Confidence
Code   Participants   Duration   Instrumentation   SC I C E   Rating

Borg
(1975)

To assess the effects of teacher language protocol modules on teacher skill acquisition & change, & on student performance



E: O X O (n=25)
C: O O (n=15)

X = protocol training in teacher language protocol modules

40 fourth, fifth, & sixth grade in-service elementary school teachers

approximately 7 weeks



1) Observation form to record 12 teaching behaviors (multiple questions, defining, vague words, general praise, specific praise, use of student ideas, voice modulation, paraphrasing, cueing, opening review, terminal structure, summary review)
2) Observer ratings of 10 teacher characteristics
3) 2 achievement tests
4) SRA Short Test of Educational Ability, level 3
5) Warner, Meeker & Eells Revised Occupational Rating Scale



1) E made significant gains on all 12 teaching behaviors while C made significant gains on 3 of 12. E significantly exceeded C on 4 of 12 behaviors.
2) When pupil scholastic ability, parents' occupation & teacher coverage of unit's content were partialled out, teacher's use of defining, voice modulation, paraphrasing & cueing were significantly related to pupil achievement on 2 measures, & teacher's use of opening review & terminal structure were significantly related to 1 achievement measure (across all subjects).
3) No significant relationships shown between teacher characteristics & pupil achievement.

ANCOVA on nonequivalent control groups

Standard content unit for final observation

Low



PROTOCOLS

Author/Date   Purpose   Components of Design   Stated Results   Threats to Validity   Weaknesses   Strengths   Confidence
Code   Participants   Duration   Instrumentation   SC I C E   Rating

Borg

To assess the effects of classroom management protocol modules & pupil self-concept protocol modules on teacher skill acquisition & on pupil behavior



R O X1 O
R O X2 O

X1 = protocol training in classroom management modules
X2 = protocol training in pupil self-concept modules



28 in-service elementary school teachers

approximately • weeks

1) Observation of teacher & pupil behaviors
2) North York Self-Concept Inventory
3) Piers-Harris Children's Self-Concept Scale

1) E teachers made significantly greater improvement in 7 of 13 behaviors than C teachers.
2) For recitation situations, E pupils showed no significant change in work involvement but significant reduction in both mildly deviant & seriously deviant behavior. C pupils showed a significant reduction in definitely off-task behavior but no other significant changes.
3) For seatwork situations, E pupils showed significant reductions in mildly deviant & seriously deviant behavior. No significant changes for C pupils.
4) C teachers received significantly more favorable post scores on 11 of 12 self-concept behaviors.
5) No significant improvement in pupil self-concept for E or C.

Unit of analysis for self-concept changes = classroom

Excellent discussion

High



Borg,
Langer, &
Wilson
(1975)

To assess the effects of classroom management protocol modules on teacher skill acquisition & on pupil behavior

E: O X O (n=20)
C: O O (n=*)

X = protocol training in classroom management modules

2* in-service elementary school teachers (control subjects drawn from same school as experimental subjects)

approximately 10 weeks



1) Observation form to record classroom management behaviors
2) Pre & post pupil observations of 5 pupil behaviors (definitely involved in class work, probably involved, definitely off task, mildly deviant, seriously deviant)

1) E teachers received more favorable post ratings on all 13 behaviors but differences generally small & nonsignificant.
2) For recitation situations, E pupils' work involvement significantly increased & deviant behavior significantly decreased.
3) For seatwork situations, E pupils' work involvement significantly increased but no significant changes for deviant behavior.



a 4 

i S 

h 



Use of ANCOVA with nonequivalent control groups

Discussion of plausible alternative explanations

PROTOCOLS

Author/Date   Purpose   Components of Design   Stated Results   Threats to Validity   Weaknesses   Strengths   Confidence
Code   Participants   Duration   Instrumentation   SC I C E   Rating

Borg &
Stone
(1974)

Part 1: To assess the effects of 2 teacher language protocol modules on teacher skill acquisition & change



O X1 O
X1 = protocol training in 2 teacher language protocols, encouragement & extension

Part 2: To compare the effects of the protocol module with the minicourse model in changing teacher behavior

19 in-service elementary school teachers

E1: O X1 O
E2: O X2 O
X1 = encouragement & extension protocol training
X2 = Minicourses 1 & 2 (general praise, specific praise, use of student ideas, prompting, seeking further clarification, refocusing, redirection)

3 hours extended over one week for each protocol module. Total: 10 hours

Protocol = 19 elementary school teachers
Minicourse 1 = 46 intermediate elementary school teachers
Minicourse 2 = ? number of kindergarten teachers



Ratings of audiotapes on 7 specific behaviors:
a) general praise
b) specific praise
c) use of student ideas
d) prompting
e) seeking further clarification
f) refocusing
g) redirection

Protocol = 10 hours
Minicourses 1 & 2 - 1

Part 1: Teachers made significant gains on all but 2 behaviors, general praise & redirection.

Part 2: E1 & E2 conditions brought about similar gains on most behaviors compared.

1) Volunteer sample

Reactive effects of testing were controlled

Low

1) Volunteer sample for protocols
2) Change score analysis


Gliessman
& Pugh
(1974)

Part 1: To assess the effects of protocol films on teacher concept acquisition

E: O X O

Part 2: To assess characteristics & reactions to use of protocol film series

X O
X = training with teacher-pupil interaction protocols

X O
X = training with teacher-pupil interaction protocols

69 masters students enrolled in educational psychology course



1) Classes taught by 14 instructors
- 294 undergraduates, graduates, pre-service, in-service & school administrators

4-1 hours of classroom instruction over a 2-3 week period

1-4 class periods

Categorizing Teacher Behavior test, Form
ft 



Significant gains in concept acquisition as a result of use of this protocol series.

1) Instructor questionnaire
2) Student rating scale

1) Instructors favorably received protocol training.
2) Pupils of instructors favorably received use of protocol training.

1) No real control group (noted by investigator)
2) Difficulty in following text



Fair

PROTOCOLS

Author/Date   Purpose   Components of Design   Stated Results   Threats to Validity   Weaknesses   Strengths   Confidence
Code   Participants   Duration   Instrumentation   SC I C E   Rating

Gliessman

<i«3a> - 



To investigate the effect of protocol films of contrasting structure (high & low) on teacher concept acquisition & teacher reactions to use of films' treatment

Pre-service & in-service teachers enrolled in graduate level educational psychology course

1) Categorizing Teaching Behavior test
2) Likert-type items from evaluation scale (to assess reactions)

1) Randomization
2) Design
3) Statistical analysis

High



Study 1: To compare the effects of high & low structure films

Study 2: To assess the interactive effects of using both types of films in a single training group

Study 3: To assess the effect of a variation that emerged during first two studies

E1: R O X1 O
E2: R O X2 O

E1: R O X1 O
E2: R O X2 O
E3: R O X3 O

E1: R O X1 O
E2: R O X2 O

N=20

N=30

1 day

2 days

X1 = protocol training using high structure film
X2 = protocol training using low structure film

N=20

2 days



Study 1: Significant gains in concept acquisition for E1 & E2 but no significant differences between the two.

Study 2:
1) Significant differences in concept acquisition between E1, E2 & E3.
2) Comparison of means revealed significantly greater concept acquisition for Ej than E1.
3) Significant increases in concept acquisition for all groups.

Study 3:
1) No significant differences between E1 & E2 for concept acquisition.
2) Significant increases in concept acquisition for both groups.
3) E1 had significantly more favorable reactions to films than

X3 = protocol training using high/low structure film

PROTOCOLS

Author/Date   Purpose   Components of Design   Stated Results   Threats to Validity   Weaknesses   Strengths   Confidence
Code   Participants   Duration   Instrumentation   SC I C E   Rating



Gliessman

& Pugh

(With) 



To assess the relative effects of different instructional treatments on concept acquisition



Study. 11 

lit ft X| 0 <e-iQ) 

Eg: ft X 2 0 (n-ll) 

Ej; ft X ? 0_ (n-U) 

StuQV 2: 

E«: ft X 5 0 <e-lQ> 

Ej: ft X, 0 <n-*> 



X1 = Concept names, definitions, & filmed exemplification

44 students in a graduate elementary education course

It students 
la a gr Adu- 
st! educa- 
tional pay 
c ho logy 



1 day? 



1 day! 



Categorizing Teacher Behavior test

Categorizing Teacher Behavior test

Study 1: Significant differences among the combined E1 & E2 groups, E3 & E4. E3 & E4 group means both significantly lower than mean of combined E1 & E2 groups.



Study j: IV group meen greater 
(p-.QIl dircctleael) thaa E 4 



Significance level
of p=.011



Fair 



X2-Concept names &
definitions

X3-Concept names on
film test

X4-unstructured
viewing of protocol
films followed by
concept names on
film test



X5-Concept names,
definitions, &
filmed exemplification
& direct
"cues" to instances
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PROTOCOLS 



Author/Date Purpose Components of Design Stated Results Threats to Validity Weaknesses Strengths Confidence

Code Participants Duration Instrumentation SC I C E Rating



Gliessman,

Pugh, &

Bielat
<197ta> 



To assess the effects
of protocol
films on teacher
concept & skill
acquisition



E: X1 O X2 O

C: (X) O X2 O



X1-protocol training
in teacher-pupil
interaction

(X)-instruction in
individual student
counseling

X2-instruction in
principles of
concept teaching



Participants: 20 in-service teachers enrolled in masters le-

Duration: 4 days

Instrumentation: 1) Categorizing Teacher Behavior test 2) Frequency counts of spo-

Stated Results: 1) For concept acquisition, E had significantly greater mean scores on total, probing & informing, than C. 2) E had significantly greater
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Appendix B: Threats to Validity* 



Threats to Statistical Conclusion Validity 

a) Low Statistical Power 

b) Violated Assumptions of Statistical Tests 

c) Fishing and the Error Rate Problem

d) Reliability of Measures 

e) Reliability of Treatment Implementation 

f) Random Irrelevancies in the Experimental Setting 

g) Random Heterogeneity of Respondents 



Threats to Internal Validity 

a) History 

b) Maturation 

c) Testing 

d) Statistical Regression 

e) Selection 

f) Mortality 

g) Interaction of Selection and History 

h) Interaction of Selection and Maturation 

i) Interaction of Selection and Instrumentation

j) Resentful Demoralization

k) Diffusion or Imitation of Treatments

l) Compensatory Rivalry



Threats to Construct Validity 

a) Inadequate Preoperational Explication of Constructs 

b) Mono-Operation Bias 

c) Mono-Method Bias 

d) Evaluation Apprehension 

e) Experimenter Expectancies 



Threats to External Validity 

a) Interaction of Selection and Treatment 

b) Interaction of Setting and Treatment 

c) Interaction of History and Treatment 



*Derived from Cook and Campbell, 1979.
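
The lettered lists in this appendix correspond to the four "Threats to Validity" columns of the study charts. As a minimal sketch, the Python lookup below expands a chart entry into its full name. It assumes an entry is written as a column abbreviation plus a letter, for example "I-e". That code format, the abbreviations SC, I, C and E, and the names THREATS and describe are illustrative assumptions, not the authors' own notation.

```python
# Hypothetical lookup for the Appendix B threat lists.
# Keys (SC, I, C, E) are assumed column abbreviations; list order follows a), b), c), ...
THREATS = {
    "SC": ("Statistical Conclusion Validity", [
        "Low Statistical Power",
        "Violated Assumptions of Statistical Tests",
        "Fishing and the Error Rate Problem",
        "Reliability of Measures",
        "Reliability of Treatment Implementation",
        "Random Irrelevancies in the Experimental Setting",
        "Random Heterogeneity of Respondents",
    ]),
    "I": ("Internal Validity", [
        "History",
        "Maturation",
        "Testing",
        "Statistical Regression",
        "Selection",
        "Mortality",
        "Interaction of Selection and History",
        "Interaction of Selection and Maturation",
        "Interaction of Selection and Instrumentation",
        "Resentful Demoralization",
        "Diffusion or Imitation of Treatments",
        "Compensatory Rivalry",
    ]),
    "C": ("Construct Validity", [
        "Inadequate Preoperational Explication of Constructs",
        "Mono-Operation Bias",
        "Mono-Method Bias",
        "Evaluation Apprehension",
        "Experimenter Expectancies",
    ]),
    "E": ("External Validity", [
        "Interaction of Selection and Treatment",
        "Interaction of Setting and Treatment",
        "Interaction of History and Treatment",
    ]),
}


def describe(code: str) -> str:
    """Expand an assumed code such as 'I-e' into 'Internal Validity: Selection'."""
    category, letter = code.split("-")
    title, names = THREATS[category]
    index = ord(letter.lower()) - ord("a")  # 'a' -> 0, 'b' -> 1, ...
    if not 0 <= index < len(names):
        raise KeyError(code)
    return f"{title}: {names[index]}"


if __name__ == "__main__":
    for code in ("SC-a", "I-e", "C-d", "E-b"):
        print(code, "=", describe(code))
```

Keeping each list in appendix order means the letter alone identifies the threat, so a chart entry stays as short as the column it sits in.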
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